Abstract

Given n (discrete or continuous) random variables $X_i$, the $(2^n - 1)$-dimensional vector obtained by evaluating
the joint entropy of all non-empty subsets of $\{X_1, \ldots, X_n\}$ is called an entropic vector. Determining the region of
entropic vectors is an important open problem with many applications in information theory. Recently, it has been 
shown that the entropy regions for discrete and continuous random variables, though different, can be determined 
from one another. An important class of continuous random variables are those that are vector-valued and jointly 
Gaussian. It is known that Gaussian random variables violate the Ingleton bound, which many random variables such 
as those obtained from linear codes over finite fields do satisfy, and they also achieve certain non-Shannon type 
inequalities. In this paper we give a full characterization of the convex cone of the entropy region of three jointly 
Gaussian vector-valued random variables and prove that it is the same as the convex cone of three scalar-valued 
Gaussian random variables and further that it yields the entire entropy region of 3 arbitrary random variables. We 
further determine the actual entropy region of 3 vector-valued jointly Gaussian random variables through a conjecture.
For $n \ge 4$ random variables, we point out a set of $2^n - 1 - \frac{n(n+1)}{2}$ minimal necessary and sufficient
conditions that $2^n - 1$ numbers must satisfy in order to correspond to the entropy vector of n scalar jointly Gaussian
random variables. This improves on a result of Holtz and Sturmfels which gave a nonminimal set of conditions. These 
constraints are related to Cayley's hyperdeterminant and hence with an eye towards characterizing the entropy region 
of jointly Gaussian random variables, we also present some new results in this area. We obtain a new (determinant)
formula for the $2 \times 2 \times 2$ hyperdeterminant and we also give a new (transparent) proof of the fact that the principal
minors of an $n \times n$ symmetric matrix satisfy the $2 \times 2 \times \cdots \times 2$ (up to n times) hyperdeterminant relations.

I. Introduction 

Obtaining the capacity region of information networks has long been an important open problem. It turns out that
there is a fundamental connection between the entropy region of a number of random variables and the capacity
region of networks [3], [4]. However, determining the entropy region has proved to be an extremely difficult problem
and there have been different approaches towards characterizing it. While most of the effort has been towards
obtaining outer bounds for the entropy region by determining valid information inequalities [5], [6], [7], [8], [9],
[10], [11], [12], some have focused on inner bounds [13], [14], [15], which may prove to be more useful since they
yield achievable regions.

Let $X_1, \ldots, X_n$ be n jointly distributed discrete random variables with arbitrary alphabet size N. The vector
of all the $2^n - 1$ joint entropies of these random variables is referred to as their "entropy vector" and conversely
any $2^n - 1$ dimensional vector whose elements can be regarded as the joint entropies of some n random variables,
for some alphabet size N, is called "entropic". The entropy region is defined as the region of all possible entropic
vectors and is denoted by $\Gamma_n^*$. Let $\mathcal{N} = \{1, \ldots, n\}$ and $s, s' \subseteq \mathcal{N}$. If we define $X_s = \{X_i : i \in s\}$ then it is
well known that the joint entropies $H(X_s)$ (or $H_s$, for simplicity) satisfy the following inequalities:

1) $H_\emptyset = 0$

2) For $s \subseteq s'$: $H_s \le H_{s'}$

3) For any $s, s'$: $H_{s \cup s'} + H_{s \cap s'} \le H_s + H_{s'}$.
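The three basic inequalities above can be checked numerically for any small discrete distribution. The following is a minimal sketch, not part of the original development: the helper names `entropy_vector` and `satisfies_basic_inequalities` and the XOR example are illustrative assumptions, and entropies are measured in bits.

```python
import itertools
import math

def entropy_vector(pmf, n):
    """Map each subset s of {0, ..., n-1} (as a sorted tuple) to the joint entropy H_s in bits.

    pmf: dict mapping n-tuples of outcomes to probabilities (hypothetical input format).
    """
    H = {(): 0.0}  # entropy of the empty set is zero (inequality 1)
    for r in range(1, n + 1):
        for s in itertools.combinations(range(n), r):
            marginal = {}
            for outcome, p in pmf.items():
                key = tuple(outcome[i] for i in s)
                marginal[key] = marginal.get(key, 0.0) + p
            H[s] = -sum(p * math.log2(p) for p in marginal.values() if p > 0)
    return H

def satisfies_basic_inequalities(H, tol=1e-9):
    """Check monotonicity (inequality 2) and submodularity (inequality 3) for all subset pairs."""
    for s in H:
        for t in H:
            a, b = set(s), set(t)
            if a <= b and H[s] > H[t] + tol:
                return False
            union = tuple(sorted(a | b))
            inter = tuple(sorted(a & b))
            if H[union] + H[inter] > H[s] + H[t] + tol:
                return False
    return True

# Illustrative example: X0, X1 are independent fair bits and X2 = X0 XOR X1.
pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
H = entropy_vector(pmf, 3)
print(H[(0, 1, 2)], satisfies_basic_inequalities(H))
```

Any entropic vector must pass this check; for $n \ge 4$ passing it is necessary but not sufficient.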

These are called the basic inequalities of Shannon information measures and the last one is referred to as the
"submodularity property". They all follow from the nonnegativity of the conditional mutual information [7], [16],
[17]. Any inequality obtained from positive linear combinations of conditional mutual information is called a
"Shannon-type" inequality. The space of all $2^n - 1$ dimensional vectors which only satisfy the Shannon inequalities
is denoted by $\Gamma_n$. It has been shown that $\Gamma_2^* = \Gamma_2$ and $\bar{\Gamma}_3^* = \Gamma_3$, where $\bar{\Gamma}_3^*$ denotes the closure of $\Gamma_3^*$ [7]. However,
for $n \ge 4$, the first non-Shannon type information inequality was discovered in 1998 [7], which demonstrated that
$\bar{\Gamma}_4^*$ is strictly smaller than $\Gamma_4$. Since then many other non-Shannon type inequalities have been discovered [18], [8],
[11], [12]. Nonetheless, the complete characterization of $\Gamma_n^*$ for $n \ge 4$ remains open.

The effort to characterize the entropy region has focused on discrete random variables, ostensibly because the
study of discrete random variables is simpler. However, continuous random variables are as important, where now
for any collection of random variables $X_s$, with joint probability density function $f_{X_s}(x_s)$, the differential entropy
is defined as

$$h_s = -\int f_{X_s}(x_s) \log f_{X_s}(x_s) \, dx_s. \tag{1}$$

Let $\sum_s \gamma_s H_s \ge 0$ be a valid discrete information inequality. This inequality is called balanced if for all $i \in \mathcal{N}$
we have $\sum_{s : i \in s} \gamma_s = 0$. Using this notion Chan [19] has shown a correspondence between discrete and continuous
information inequalities, which allows us to compute the entropy region for one from the other.

Theorem 1 (Discrete/continuous information inequalities):

1) A linear continuous information inequality $\sum_s \gamma_s h_s \ge 0$ is valid if and only if its discrete counterpart
$\sum_s \gamma_s H_s \ge 0$ is balanced and valid.

2) A linear discrete information inequality $\sum_s \gamma_s H_s \ge 0$ is valid if and only if it can be written as $\sum_s \beta_s H_s +
\sum_{i=1}^{n} r_i (H_{i, i^c} - H_{i^c})$ for some $r_i \ge 0$, where $\sum_s \beta_s h_s \ge 0$ is a valid continuous information inequality ($i^c$
denotes the complement of i in $\mathcal{N}$).

The above Theorem suggests that one can also study continuous random variables to determine $\Gamma_n^*$. Among all
continuous random variables, the most natural ones to study first (for many of the reasons further described below) 
are Gaussians. This will be the main focus of this paper. 
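The balancedness condition used in Theorem 1 is easy to test mechanically. The sketch below is illustrative only: the function name `is_balanced` and the representation of an inequality as a dictionary from subsets to coefficients are assumptions made for this example.

```python
def is_balanced(gamma, n, tol=1e-12):
    """Return True if sum over {s : i in s} of gamma_s vanishes for every i in {1, ..., n}.

    gamma: dict mapping frozenset subsets of {1, ..., n} to the coefficients
    of the inequality sum_s gamma_s H_s >= 0 (hypothetical input format).
    """
    return all(
        abs(sum(c for s, c in gamma.items() if i in s)) <= tol
        for i in range(1, n + 1)
    )

# H_1 + H_2 - H_12 >= 0 (nonnegativity of mutual information): balanced.
mutual_info = {frozenset({1}): 1, frozenset({2}): 1, frozenset({1, 2}): -1}

# H_12 - H_2 >= 0 (nonnegativity of conditional entropy): not balanced.
cond_entropy = {frozenset({1, 2}): 1, frozenset({2}): -1}

print(is_balanced(mutual_info, 2), is_balanced(cond_entropy, 2))
```

This matches the familiar picture: mutual information stays nonnegative for continuous random variables, while conditional differential entropy can be negative.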

Let $X_1, \ldots, X_n \in \mathbb{R}^T$ be n jointly distributed zero-mean$^1$ vector-valued real Gaussian random variables of
vector size T with covariance matrix $R \in \mathbb{R}^{nT \times nT}$. Clearly, R is symmetric, positive semidefinite, and consists of
block matrices of size $T \times T$ (corresponding to each random variable). We will allow T to be arbitrary and will
therefore consider the normalized joint entropy of any subset $s \subseteq \mathcal{N}$ of these random variables

$$h_s = \frac{1}{T} \cdot \frac{1}{2} \log\left( (2\pi e)^{T|s|} \det R_s \right), \tag{2}$$

where $|s|$ denotes the cardinality of the set s and $R_s$ is the $T|s| \times T|s|$ matrix obtained by keeping those block
rows and block columns of R that are indexed by s. Note that our normalization is by the dimensionality of the
$X_i$, i.e., by T, and that we have used h to denote normalized entropy.
Normalization has the following important consequence.

Theorem 2 (Convexity of the region for h): The closure of the region of normalized Gaussian entropy vectors is
convex.

Proof: Let $h^x$ and $h^y$ be two normalized Gaussian entropy vectors. This means that the first corresponds to
some collection of Gaussian random variables $X_1, \ldots, X_n \in \mathbb{R}^{T_x}$ with the covariance matrix $R^x$, for some $T_x$,
and the second to some other collection $Y_1, \ldots, Y_n \in \mathbb{R}^{T_y}$ with the covariance matrix $R^y$, for some $T_y$. Now
generate $N_x$ copies of jointly Gaussian random variables $X_1, \ldots, X_n$ and $N_y$ copies of $Y_1, \ldots, Y_n$ and define
the new set of random variables

$$Z_i = \left[ (X_i^1)^t \; \cdots \; (X_i^{N_x})^t \;\; (Y_i^1)^t \; \cdots \; (Y_i^{N_y})^t \right]^t,$$

where $(\cdot)^t$ denotes the transpose, by stacking $N_x$ and $N_y$ independent copies of each, respectively, into an $N_x T_x + N_y T_y$ dimensional vector.
Clearly the $Z_i$ are jointly Gaussian. Due to the independence of the $X_i^k$ and $Y_i^l$, $k = 1, \ldots, N_x$, $l = 1, \ldots, N_y$, the
non-normalized entropy of the collection of random variables $Z_s$ is

$$N_x T_x h_s^x + N_y T_y h_s^y.$$

To obtain the normalized entropy we should divide by $N_x T_x + N_y T_y$:

$$h_s^z = \frac{N_x T_x}{N_x T_x + N_y T_y} h_s^x + \frac{N_y T_y}{N_x T_x + N_y T_y} h_s^y,$$

which, since $N_x$ and $N_y$ are arbitrary, implies that every vector that is a convex combination of $h^x$ and $h^y$ is
entropic and generated by a Gaussian. ■
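The time-sharing argument in the proof can be reproduced numerically. The following is a small sketch under simplifying assumptions: the $X_i$ and $Y_i$ are taken to be scalar ($T_x = T_y = 1$), the helper names `det`, `normalized_entropy` and `stack` are hypothetical, and the two covariance matrices are arbitrary examples.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def normalized_entropy(R, s, T):
    """Equation (2): normalized entropy h_s for a covariance R made of T x T blocks.

    s is a tuple of 0-based block indices.
    """
    idx = [b * T + t for b in s for t in range(T)]
    sub = [[R[i][j] for j in idx] for i in idx]
    return (0.5 / T) * math.log((2 * math.pi * math.e) ** (T * len(s)) * det(sub))

def stack(Rx, Ry, n, Nx, Ny):
    """Covariance of Z_i = [Nx independent copies of X_i, Ny independent copies of Y_i]."""
    T = Nx + Ny
    R = [[0.0] * (n * T) for _ in range(n * T)]
    for i in range(n):
        for j in range(n):
            for k in range(T):
                src = Rx if k < Nx else Ry
                R[i * T + k][j * T + k] = src[i][j]  # distinct copies are independent
    return R, T

Rx = [[1.0, 0.5], [0.5, 1.0]]
Ry = [[2.0, 0.0], [0.0, 3.0]]
Rz, T = stack(Rx, Ry, n=2, Nx=1, Ny=2)
for s in [(0,), (1,), (0, 1)]:
    mix = (1 / 3) * normalized_entropy(Rx, s, 1) + (2 / 3) * normalized_entropy(Ry, s, 1)
    print(s, normalized_entropy(Rz, s, T), mix)
```

With $N_x = 1$ and $N_y = 2$ the stacked variables realize the convex combination with weights 1/3 and 2/3; other rational weights follow by changing the copy counts.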
Note that $h_s$ can also be written as follows:

$$h_s = \frac{1}{2T} \log\det R_s + \frac{|s|}{2} \log 2\pi e. \tag{3}$$

1 Since differential entropy is invariant to shifts there is no point in assuming nonzero means for the $X_i$.

Therefore if we define

$$g_s = \frac{1}{T} \log\det R_s, \tag{4}$$

it is obvious that $g_s$ can be obtained from $h_s$ and vice versa. All that is involved is a scaling of the covariance
matrix R. Denote the vector obtained from all entries $g_s$, $s \subseteq \{1, \ldots, n\}$, by g. For balanced inequalities there is
the additional property,

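Lemma 1: If the inequality $\sum_s \gamma_s H_s \ge 0$ is balanced then $\sum_s |s| \gamma_s = 0$.
Proof: We can simply write,

$$\sum_s |s| \gamma_s = \sum_s \sum_{i \in s} \gamma_s = \sum_{i=1}^{n} \sum_{s : i \in s} \gamma_s = 0. \tag{5}$$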

Therefore the set of linear balanced information inequalities that g and h satisfy is the same. Moreover any other 
type of inequality that h satisfies can be converted to an inequality for g and vice versa and therefore the space 
of g and h can be obtained from each other. For simplicity, we will therefore use g s instead of h s throughout the 
paper and use the term entropy for both g and h interchangeably. 

In this paper we characterize the entropy region of 3 jointly Gaussian random variables and study the minimal
set of necessary and sufficient conditions for a $2^n - 1$ dimensional vector to represent an entropy vector of n
scalar jointly Gaussian random variables for $n \ge 4$. As equation (4) suggests, the entropy of any subset of random
variables from a collection of Gaussian random variables is simply the "log" of the principal minor of the covariance
matrix corresponding to this subset. Therefore studying the entropy of Gaussian random variables involves studying
the relations among principal minors of symmetric positive semi-definite matrices, i.e., covariance matrices. It has
recently been noted that one of these relations is the so-called Cayley "hyperdeterminant" [20]. Therefore, along with
the study of entropy of Gaussian random variables, we also examine the hyperdeterminant relation.

The remainder of this paper is organized as follows. In the next section we review background and some motivating
results on the entropies of Gaussian random variables. Section III states the main results on the characterization of
the entropy region of 3 jointly Gaussian random variables. In Section IV we examine the hyperdeterminant relation
in connection to the entropy region of Gaussian random variables. We give a determinant formula for calculating
the special $2 \times 2 \times 2$ hyperdeterminant. Moreover we present a new and transparent proof of the result of [20] on
why the principal minors of a symmetric matrix satisfy the hyperdeterminant relations. In Section V we study the
minimal set of necessary and sufficient conditions for a $2^n - 1$ dimensional vector to be the entropy vector of n
scalar jointly Gaussian random variables. For n = 4, there are 5 such equations and we explicitly state them.



From (4) it can be easily seen that any valid information inequality for entropies can be immediately converted
into an inequality for the (block) principal minors of a symmetric, positive semi-definite matrix. This connection has
been previously used in the literature. In fact one can study determinant inequalities by studying the corresponding
entropy inequalities, see e.g. [21].

II. Some Known Results

Let g be the normalized entropy vector corresponding to some vector-valued collection of random variables with
an $nT \times nT$ covariance matrix R. Further, let m denote the vector of block principal minors of R. Then it is clear
that $m = e^{gT}$, where the exponential acts component-wise on the entries of g. Then the submodularity of entropy
translates to the following inequality for the principal minors:

$$m_{s \cup s'} \cdot m_{s \cap s'} \le m_s \cdot m_{s'}. \tag{6}$$

In the context of determinant inequalities for a Hermitian positive semidefinite matrix this is known as the
"Koteljanskii" inequality and is a generalization of the "Hadamard-Fischer" inequalities [22]. Dating back at least
to Hadamard in 1893, the study of determinant inequalities is an old subject which is of interest in its own right
and has many applications in matrix analysis and probability theory.
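Inequality (6) can be verified exhaustively for a small positive definite matrix. The sketch below is illustrative: the helper names are hypothetical, the matrix $R = AA^t$ is an arbitrary example, and the empty minor is taken to be 1 by convention.

```python
import itertools

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def principal_minor(R, s):
    """Principal minor of R indexed by the set s; the empty minor is 1."""
    if not s:
        return 1.0
    idx = sorted(s)
    return det([[R[i][j] for j in idx] for i in idx])

def koteljanskii_holds(R, rel_tol=1e-9):
    """Check m_{s u s'} m_{s n s'} <= m_s m_{s'} over all pairs of index subsets."""
    n = len(R)
    subsets = [frozenset(c) for r in range(n + 1)
               for c in itertools.combinations(range(n), r)]
    for s in subsets:
        for t in subsets:
            lhs = principal_minor(R, s | t) * principal_minor(R, s & t)
            rhs = principal_minor(R, s) * principal_minor(R, t)
            if lhs > rhs * (1 + rel_tol) + rel_tol:
                return False
    return True

# Arbitrary example: R = A A^t is symmetric positive definite because A is invertible.
A = [[2, 0, 1, 0], [1, 3, 0, 1], [0, 1, 2, 1], [1, 0, 1, 2]]
R = [[float(sum(A[i][k] * A[j][k] for k in range(4))) for j in range(4)] for i in range(4)]
print(koteljanskii_holds(R))
```

Taking s and s' to be disjoint recovers the Hadamard-Fischer special case, since the intersection minor is then 1.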

Some of the interesting problems in the area of principal minor relations include characterizing the set of
bounded ratios of principal minors for a given class of matrices (e.g. the class of positive definite matrices, or the class
of matrices all of whose principal minors are positive, i.e., the P matrices) [23], [24], studying the Gaussian
conditional independence structure in the context of probabilistic representations [25] and detecting P matrices, e.g.,
via computation of all the principal minors of a given matrix [26].

Although determinant inequalities have been studied extensively on their own and also through the entropy
inequalities, the reverse approach of determining Gaussian entropies via the exploration of the space of principal
minors has been less considered [25], [27]. As it turns out, this approach is deeply related to the "principal minor
assignment" problem where a matrix with a set of fixed principal minors is sought. Recently there has been progress
in this area for symmetric matrices [20], [28] and we will discuss this in more detail in Sections IV and V.

Apart from the result of [27] which shows the tightness of the Zhang-Yeung non-Shannon inequality [14] for
Gaussian random variables, one of the encouraging results for studying the Gaussian random variables is that they
can violate the "Ingleton bound". This bound is one of the best known inner bounds for $\bar{\Gamma}_4^*$ [14].

Theorem 3 (Ingleton inequality): [29] Let $v_1, \ldots, v_n$ be n vector subspaces and let $\mathcal{N} = \{1, \ldots, n\}$. Further
let $s \subseteq \mathcal{N}$ and $r_s$ be the rank function defined as the dimension of the subspace $\oplus_{i \in s} v_i$. Then for any subsets
$s_1, s_2, s_3, s_4 \subseteq \mathcal{N}$, we have

$$r_{s_1} + r_{s_2} + r_{s_1 \cup s_2 \cup s_3} + r_{s_1 \cup s_2 \cup s_4} + r_{s_3 \cup s_4} - r_{s_1 \cup s_2} - r_{s_1 \cup s_3} - r_{s_1 \cup s_4} - r_{s_2 \cup s_3} - r_{s_2 \cup s_4} \le 0. \tag{7}$$

The Ingleton inequality was first obtained for the rank of vector spaces. However, it turns out that certain types of
entropy functions, in particular all linearly representable (corresponding to linear codes over finite fields) and
pseudo-abelian group characterizable entropy functions, also satisfy this inequality and hence fall into this inner bound [30],
[31]. However, if we consider 4 jointly Gaussian random variables, we interestingly find that they can violate the
Ingleton bound. Consider the following covariance matrix:

$$R = \begin{pmatrix} 1 & \varepsilon & a & a \\ \varepsilon & 1 & a & a \\ a & a & 1 & 0 \\ a & a & 0 & 1 \end{pmatrix}. \tag{8}$$

To violate the Ingleton inequality we need to have:

$$g_1 + g_2 + g_{123} + g_{124} + g_{34} - g_{12} - g_{13} - g_{14} - g_{23} - g_{24} > 0, \tag{9}$$

or equivalently in terms of the minors m:

$$\frac{m_1 m_2 m_{123} m_{124} m_{34}}{m_{12} m_{13} m_{14} m_{23} m_{24}} > 1. \tag{10}$$

Substituting for values of m from the covariance matrix and simplifying we obtain:

$$\frac{1 - \varepsilon}{1 + \varepsilon} > \left( \frac{1 - 2a^2 + a^4}{1 + \varepsilon - 2a^2} \right)^2. \tag{11}$$

Fig. 1. Feasible region of $\varepsilon$ and a for the specific Ingleton violating example

Moreover imposing positivity conditions for this matrix to correspond to a true covariance matrix gives $0 < a^2 < 0.5$,
$4a^2 - 1 < \varepsilon < 1$. Solving inequality (11) subject to these constraints yields a region of permissible $\varepsilon$ and $a^2$ (Fig.
1). In particular the point $\varepsilon = 0.25$, $a = 0.5$ lies in this region. Interestingly enough, this example has also been
discovered in the context of determinantal inequalities in [23].
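The violation at the quoted point can be confirmed directly from the principal minors of the example covariance matrix. This is a minimal sketch for the scalar case ($T = 1$, so $g_s = \log\det R_s$); the function names are hypothetical and the zeros in the lower-right block are those of the matrix as reconstructed here.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def ingleton_expression(eps, a):
    """g_1 + g_2 + g_123 + g_124 + g_34 - g_12 - g_13 - g_14 - g_23 - g_24 for the example matrix."""
    R = [[1, eps, a, a],
         [eps, 1, a, a],
         [a, a, 1, 0],
         [a, a, 0, 1]]

    def g(*s):  # s lists 1-based variable indices
        idx = [i - 1 for i in s]
        return math.log(det([[R[i][j] for j in idx] for i in idx]))

    return (g(1) + g(2) + g(1, 2, 3) + g(1, 2, 4) + g(3, 4)
            - g(1, 2) - g(1, 3) - g(1, 4) - g(2, 3) - g(2, 4))

value = ingleton_expression(0.25, 0.5)
print(value, math.exp(value))  # a positive value means the Ingleton inequality is violated
```

At $\varepsilon = 0.25$, $a = 0.5$ the minor ratio works out to $16/15 > 1$, whereas a point such as $\varepsilon = 0.25$, $a = 0.1$ gives a ratio below 1 and therefore no violation.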






Taking these results into account, we will hence study the Gaussian entropy region for 2 and 3 random variables
and give the minimal number of necessary and sufficient conditions for a $2^n - 1$ dimensional vector to correspond
to the entropy of n scalar jointly Gaussian random variables in the following sections.

III. Entropy Region of 2 and 3 Gaussian Random Variables 

The results mentioned in the previous section (violation of the Ingleton bound and tightness of the non-Shannon
inequality) lead one to speculate whether the entropy region for arbitrary continuous random variables is equal to
the entropy region of (vector-valued) Gaussian ones. Although this is the case for n = 2 random variables, it is not
true for n = 3. What is true for n = 3 is that the entropy region of 3 arbitrary continuous random variables can be
obtained from the convex cone of the region of 3 scalar-valued Gaussian random variables.

A. n = 2 

The entropy region of 2 jointly Gaussian random variables is trivially equal to the whole entropy region of 2 arbitrarily
distributed continuous random variables.

Theorem 4: The entropy region of 2 jointly Gaussian random variables is described by the single inequality
$g_{12} \le g_1 + g_2$ and is equal to the entropy region of 2 arbitrarily distributed continuous random variables.

Proof: Since it is known that the continuous entropy region is described by the single balanced inequality
$h_{12} \le h_1 + h_2$, to prove the theorem it is sufficient to show that any entropy vector $[h_1, h_2, h_{12}]$ satisfying this
inequality may be described by 2 jointly Gaussian random variables, and this is trivial to show. ■
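The step called trivial in the proof can be made explicit for scalar Gaussians: the two variances fix $h_1$ and $h_2$, and the correlation coefficient fixes $h_{12}$. The sketch below is illustrative only; the function names are hypothetical and entropies are in nats.

```python
import math

TWO_PI_E = 2 * math.pi * math.e

def gaussian_pair_for(h1, h2, h12):
    """Return (var1, var2, rho) of a scalar Gaussian pair with differential entropies h1, h2, h12."""
    if h12 > h1 + h2:
        raise ValueError("requires h12 <= h1 + h2")
    var1 = math.exp(2 * h1) / TWO_PI_E
    var2 = math.exp(2 * h2) / TWO_PI_E
    # h12 = h1 + h2 + 0.5 * log(1 - rho^2), solved for rho >= 0
    rho = math.sqrt(max(0.0, 1.0 - math.exp(2 * (h12 - h1 - h2))))
    return var1, var2, rho

def entropies_of(var1, var2, rho):
    """Differential entropies (h1, h2, h12) of a scalar Gaussian pair."""
    h1 = 0.5 * math.log(TWO_PI_E * var1)
    h2 = 0.5 * math.log(TWO_PI_E * var2)
    h12 = 0.5 * math.log(TWO_PI_E ** 2 * var1 * var2 * (1 - rho ** 2))
    return h1, h2, h12

target = (0.3, -1.2, -2.0)
params = gaussian_pair_for(*target)
print(params, entropies_of(*params))
```

Every finite target with $h_{12} \le h_1 + h_2$ is reached with $|\rho| < 1$; the boundary case $h_{12} = h_1 + h_2$ corresponds to $\rho = 0$.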

B. Main Results for n = 3 

Although we consider vector-valued jointly Gaussian random variables, for n = 3 we interestingly find that
considering the convex hull of scalar jointly Gaussian random variables is sufficient for characterizing the Gaussian
entropy region.

Theorem 5: The entropy region of 3 vector-valued Gaussian random variables can be obtained from the convex 
hull of scalar Gaussian random variables. 

The main result about the convex cone of the entropy region of 3 jointly Gaussian random variables is formalized 
in the next theorem: 

Theorem 6 (Convex Cone of the Entropy Region of 3 Scalar-Valued Gaussian Random Variables): The convex 
cone generated by the entropy region of 3 scalar-valued Gaussian random variables gives the entropy region of 3 
arbitrary continuously distributed random variables. 

This theorem states that one can indeed construct the entropy region of n = 3 continuous random variables from
the entropy region of Gaussian random variables and therefore it encourages the study of Gaussians for $n \ge 4$.
Moreover, Theorem 6 addresses the "convex cone" of the Gaussian entropy region and as it turns out for most
practical purposes characterizing the "convex cone" is sufficient. The problem of characterizing the entropy region
of Gaussian random variables itself, rather than its convex cone, is more complicated. For 3 random variables we
state the following conjecture:

Conjecture 1 (Entropy Region of 3 Vector-Valued Jointly Gaussian Random Variables): Let the vector g, defined as
$g = [g_1, g_2, g_3, g_{12}, g_{23}, g_{31}, g_{123}]^t$, be an entropy vector generated by 3 vector-valued Gaussian random variables.
Define Xk = e 9i '~ 3i ~ 9: > and y — + 2maxfcXfc — 2~2k x k- The closure of the Gaussian entropy region 

generated by such g vectors is characterized by,

1) For $y \le 0$:

$$g_{ij} \le g_i + g_j, \qquad g_{123} \le \min_{j} \left( g_{ij} + g_{jk} - g_j \right). \tag{12}$$

2) For $y > 0$:

$$g_{ij} \le g_i + g_j, \qquad g_{123} \le \sum_k g_k + \log \max\left( 0, \; -2 + \sum_k x_k + 2\sqrt{\prod_k (1 - x_k)} \right). \tag{13}$$

In other words we conjecture that the entropy region for three Gaussian random variables is simply given by the
above inequalities. Thus, when $y \le 0$, the Gaussian entropy region coincides with the continuous entropy region;
however, when $y > 0$ (and this can happen for some valid entropy vectors), we have the tighter upper bound (13)
on $g_{123}$. In other words the actual Gaussian entropy region for n = 3 vector-valued random variables is strictly
smaller than the entropy region of 3 arbitrarily distributed continuous random variables.

We strongly believe the above conjecture to be true. The gap in our proof is a certain function inequality,
which all our simulations suggest to be true (see Conjecture 2).
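As a sanity check on the expression inside the logarithm of the second bound, consider unit-variance scalar Gaussians, for which $g_i = 0$ and $e^{g_{ij} - g_i - g_j} = 1 - \rho_{ij}^2$. The sketch below is illustrative (function names are hypothetical) and only compares that expression with the exact $3 \times 3$ determinant; it does not test which case of the conjecture applies.

```python
import math

def bound_rhs(rho12, rho13, rho23):
    """log max(0, -2 + sum_k x_k + 2 sqrt(prod_k (1 - x_k))) with x_k = 1 - rho_ij^2."""
    x = [1 - rho23 ** 2, 1 - rho13 ** 2, 1 - rho12 ** 2]
    inner = -2 + sum(x) + 2 * math.sqrt(math.prod(1 - xk for xk in x))
    return math.log(inner) if inner > 0 else float("-inf")

def scalar_g123(rho12, rho13, rho23):
    """g_123 = log det of the 3 x 3 correlation matrix (unit variances, T = 1)."""
    d = 1 - rho12 ** 2 - rho13 ** 2 - rho23 ** 2 + 2 * rho12 * rho13 * rho23
    return math.log(d)

print(scalar_g123(0.5, 0.4, 0.3), bound_rhs(0.5, 0.4, 0.3))    # equal
print(scalar_g123(-0.5, 0.4, 0.3), bound_rhs(-0.5, 0.4, 0.3))  # strictly below the bound
```

The two quantities agree exactly when the product of the three correlations is nonnegative, because $\sqrt{\prod_k (1 - x_k)} = |\rho_{12}\rho_{13}\rho_{23}|$; otherwise the determinant is strictly smaller.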



C. Proof of Main Results for n = 3 

In what follows we give the proof of the results stated in the previous section for n = 3. The basic idea is 
to determine the structure of the Gaussian random variables that generate the boundary of the entropy region for 
Gaussians, and then to determine what the boundary entropies are. We need a few lemmas: 

Lemma 2 (Boundary of the Gaussian Entropy Region): The boundary of the Gaussian entropy region is generated
by the concatenation of a set of vector-valued Gaussian random variables with covariance

$$\begin{pmatrix} \alpha_{11} I_{\hat T} & \alpha_{12} \Phi_{12} & \alpha_{13} \Phi_{13} \\ \alpha_{12} \Phi_{12}^t & \alpha_{22} I_{\hat T} & \alpha_{23} \Phi_{23} \\ \alpha_{13} \Phi_{13}^t & \alpha_{23} \Phi_{23}^t & \alpha_{33} I_{\hat T} \end{pmatrix}, \tag{14}$$

where the $\Phi_{ij}$ are orthogonal matrices such that $\Phi_{13}^t \Phi_{12} \Phi_{23} = \mathrm{sign}(\alpha_{12} \alpha_{13} \alpha_{23}) I$, and another set of independent
vector-valued Gaussian random variables with covariance

$$\begin{pmatrix} \alpha_{11} I_{T - \hat T} & 0 & 0 \\ 0 & \alpha_{22} I_{T - \hat T} & 0 \\ 0 & 0 & \alpha_{33} I_{T - \hat T} \end{pmatrix}. \tag{15}$$

Proof: To find the boundary region for 3 jointly Gaussian random variables, we can maximize linear functions
of the entropy vector. We can therefore take an arbitrary set of constants $\gamma_s$, $s \subseteq \{1, 2, 3\}$, and solve the following
maximization problem:

$$\max_{h} \sum_{s \subseteq \{1,2,3\}} \gamma_s h_s, \tag{16}$$


or equivalently,

$$\max_{R} \sum_{s \subseteq \{1,2,3\}} \gamma_s \log\det R_{[s]}, \tag{17}$$

where R is the $3T \times 3T$ block covariance matrix and $R_{[s]}$ denotes the submatrix of R whose rows and columns
are indexed by s. We shall assume all the principal minors of R are nonzero. The optimization problems (16), (17)
also come about when we fix any 6 of the entropies and try to maximize the last one. KKT conditions necessitate
that the derivative of (17) with respect to R be zero, i.e. $\frac{\partial}{\partial R} \sum_{s \subseteq \{1,2,3\}} \gamma_s \log\det R_{[s]} = 0$. To compute the
derivatives we note that $\frac{\partial}{\partial X} \log\det X = X^{-t}$.$^2$ However since the covariance matrix R is symmetric, we can further
write $\frac{\partial}{\partial X} \log\det X = X^{-1}$. If we adopt the following notation,



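$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}^{-1}, \quad U = \begin{pmatrix} U_{22} & U_{23} \\ U_{32} & U_{33} \end{pmatrix} = \begin{pmatrix} R_{22} & R_{23} \\ R_{32} & R_{33} \end{pmatrix}^{-1}, \quad W = \begin{pmatrix} W_{11} & W_{13} \\ W_{31} & W_{33} \end{pmatrix} = \begin{pmatrix} R_{11} & R_{13} \\ R_{31} & R_{33} \end{pmatrix}^{-1}, \quad V_{ij} = (R^{-1})_{ij}. \tag{18}$$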



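Then we obtain,

$$\begin{pmatrix} \gamma_1 R_{11}^{-1} & 0 & 0 \\ 0 & \gamma_2 R_{22}^{-1} & 0 \\ 0 & 0 & \gamma_3 R_{33}^{-1} \end{pmatrix} + \gamma_{12} \begin{pmatrix} S_{11} & S_{12} & 0 \\ S_{21} & S_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix} + \gamma_{13} \begin{pmatrix} W_{11} & 0 & W_{13} \\ 0 & 0 & 0 \\ W_{31} & 0 & W_{33} \end{pmatrix} + \gamma_{23} \begin{pmatrix} 0 & 0 & 0 \\ 0 & U_{22} & U_{23} \\ 0 & U_{32} & U_{33} \end{pmatrix} + \gamma_{123} R^{-1} = 0. \tag{19}$$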



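Now if we assume $\det R_{ii} = \alpha_{ii}^T$, $\alpha_{ii} > 0$, then we can define the following "unit-determinant" matrix:

$$L = \begin{pmatrix} \frac{1}{\sqrt{\alpha_{11}}} R_{11}^{1/2} & 0 & 0 \\ 0 & \frac{1}{\sqrt{\alpha_{22}}} R_{22}^{1/2} & 0 \\ 0 & 0 & \frac{1}{\sqrt{\alpha_{33}}} R_{33}^{1/2} \end{pmatrix}. \tag{20}$$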



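2 Note that had we taken the derivative with respect to a symmetric matrix X from the beginning, then we would have had
$\frac{\partial}{\partial X} \log\det X = 2X^{-1} - \mathrm{diag}(X^{-1})$, where "diag(·)" denotes the diagonal elements of its argument. However this derivation would also
result in equation (19).

Let $L_{[s]}$, $s \subseteq \{1, 2, 3\}$, be the submatrix of L obtained from choosing rows and columns of L that are indexed by
s. Then if we multiply (19) from left and right by L we obtain:

$$\begin{align}
& \begin{pmatrix} \gamma_1 L_{[1]} R_{11}^{-1} L_{[1]} & 0 & 0 \\ 0 & \gamma_2 L_{[2]} R_{22}^{-1} L_{[2]} & 0 \\ 0 & 0 & \gamma_3 L_{[3]} R_{33}^{-1} L_{[3]} \end{pmatrix} + \gamma_{12} \begin{pmatrix} L_{[1,2]} S L_{[1,2]} & 0 \\ 0 & 0 \end{pmatrix} + \gamma_{23} \begin{pmatrix} 0 & 0 \\ 0 & L_{[2,3]} U L_{[2,3]} \end{pmatrix} \tag{21} \\
& + \gamma_{13} \begin{pmatrix} (L_{[1,3]} W L_{[1,3]})_{1,1} & 0 & (L_{[1,3]} W L_{[1,3]})_{1,2} \\ 0 & 0 & 0 \\ (L_{[1,3]} W L_{[1,3]})_{2,1} & 0 & (L_{[1,3]} W L_{[1,3]})_{2,2} \end{pmatrix} + \gamma_{123} L R^{-1} L = 0. \tag{22}
\end{align}$$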



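Due to the special structure of L, we have $(L_{[s]})^{-1} R_{[s]} (L_{[s]})^{-1} = (L^{-1} R L^{-1})_{[s]}$; therefore if we define

$$\tilde R = L^{-1} R L^{-1} \tag{23}$$

and $\tilde S$, $\tilde W$, $\tilde U$ and $\tilde V$ also similar to (18), i.e.,

$$\tilde S = \begin{pmatrix} \tilde S_{11} & \tilde S_{12} \\ \tilde S_{21} & \tilde S_{22} \end{pmatrix} = \begin{pmatrix} \tilde R_{11} & \tilde R_{12} \\ \tilde R_{21} & \tilde R_{22} \end{pmatrix}^{-1}, \quad \tilde W = \begin{pmatrix} \tilde W_{11} & \tilde W_{13} \\ \tilde W_{31} & \tilde W_{33} \end{pmatrix} = \begin{pmatrix} \tilde R_{11} & \tilde R_{13} \\ \tilde R_{31} & \tilde R_{33} \end{pmatrix}^{-1}, \quad \tilde U = \begin{pmatrix} \tilde U_{22} & \tilde U_{23} \\ \tilde U_{32} & \tilde U_{33} \end{pmatrix} = \begin{pmatrix} \tilde R_{22} & \tilde R_{23} \\ \tilde R_{32} & \tilde R_{33} \end{pmatrix}^{-1}, \quad \tilde V_{ij} = (\tilde R^{-1})_{ij}, \tag{24}$$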



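it follows that (19) will be satisfied by $\tilde R, \tilde S, \tilde W, \tilde U, \tilde V$ instead of R, S, W, U, V. In other words we have the following
equation:

$$\begin{pmatrix} \gamma_1 \tilde R_{11}^{-1} & 0 & 0 \\ 0 & \gamma_2 \tilde R_{22}^{-1} & 0 \\ 0 & 0 & \gamma_3 \tilde R_{33}^{-1} \end{pmatrix} + \gamma_{12} \begin{pmatrix} \tilde S_{11} & \tilde S_{12} & 0 \\ \tilde S_{21} & \tilde S_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix} + \gamma_{13} \begin{pmatrix} \tilde W_{11} & 0 & \tilde W_{13} \\ 0 & 0 & 0 \\ \tilde W_{31} & 0 & \tilde W_{33} \end{pmatrix} + \gamma_{23} \begin{pmatrix} 0 & 0 & 0 \\ 0 & \tilde U_{22} & \tilde U_{23} \\ 0 & \tilde U_{32} & \tilde U_{33} \end{pmatrix} + \gamma_{123} \tilde R^{-1} = 0. \tag{25}$$

Multiplying (25) by $\tilde R$ from the right we obtain,

$$\begin{pmatrix} \gamma_1 I & \gamma_1 \tilde R_{11}^{-1} \tilde R_{12} & \gamma_1 \tilde R_{11}^{-1} \tilde R_{13} \\ \gamma_2 \tilde R_{22}^{-1} \tilde R_{21} & \gamma_2 I & \gamma_2 \tilde R_{22}^{-1} \tilde R_{23} \\ \gamma_3 \tilde R_{33}^{-1} \tilde R_{31} & \gamma_3 \tilde R_{33}^{-1} \tilde R_{32} & \gamma_3 I \end{pmatrix} + \gamma_{12} \begin{pmatrix} I & 0 & \tilde S_{11} \tilde R_{13} + \tilde S_{12} \tilde R_{23} \\ 0 & I & \tilde S_{21} \tilde R_{13} + \tilde S_{22} \tilde R_{23} \\ 0 & 0 & 0 \end{pmatrix} + \gamma_{13} \begin{pmatrix} I & \tilde W_{11} \tilde R_{12} + \tilde W_{13} \tilde R_{32} & 0 \\ 0 & 0 & 0 \\ 0 & \tilde W_{31} \tilde R_{12} + \tilde W_{33} \tilde R_{32} & I \end{pmatrix} + \gamma_{23} \begin{pmatrix} 0 & 0 & 0 \\ \tilde U_{22} \tilde R_{21} + \tilde U_{23} \tilde R_{31} & I & 0 \\ \tilde U_{32} \tilde R_{21} + \tilde U_{33} \tilde R_{31} & 0 & I \end{pmatrix} + \gamma_{123} I = 0. \tag{26}$$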
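Note that equating the diagonal elements to zero results in the following constraints on the $\gamma$ coefficients:

$$\gamma_1 + \gamma_{12} + \gamma_{13} + \gamma_{123} = 0 \tag{27}$$

$$\gamma_2 + \gamma_{12} + \gamma_{23} + \gamma_{123} = 0 \tag{28}$$

$$\gamma_3 + \gamma_{13} + \gamma_{23} + \gamma_{123} = 0 \tag{29}$$

These imply that the equations representing the touching hyperplanes to the Gaussian region should be balanced (see
Theorem 1). Now considering blocks (2,1), (3,1) together, (1,2), (3,2) with each other and (1,3), (2,3) simultaneously
in (26) and noting that $\tilde R_{ii} = \alpha_{ii} I$, we obtain

$$\left[ \begin{pmatrix} \frac{\gamma_1}{\alpha_{11}} I & 0 \\ 0 & \frac{\gamma_2}{\alpha_{22}} I \end{pmatrix} + \gamma_{12} \begin{pmatrix} \tilde R_{11} & \tilde R_{12} \\ \tilde R_{21} & \tilde R_{22} \end{pmatrix}^{-1} \right] \begin{pmatrix} \tilde R_{13} \\ \tilde R_{23} \end{pmatrix} = 0 \tag{30}$$

$$\left[ \begin{pmatrix} \frac{\gamma_1}{\alpha_{11}} I & 0 \\ 0 & \frac{\gamma_3}{\alpha_{33}} I \end{pmatrix} + \gamma_{13} \begin{pmatrix} \tilde R_{11} & \tilde R_{13} \\ \tilde R_{31} & \tilde R_{33} \end{pmatrix}^{-1} \right] \begin{pmatrix} \tilde R_{12} \\ \tilde R_{32} \end{pmatrix} = 0 \tag{31}$$

$$\left[ \begin{pmatrix} \frac{\gamma_2}{\alpha_{22}} I & 0 \\ 0 & \frac{\gamma_3}{\alpha_{33}} I \end{pmatrix} + \gamma_{23} \begin{pmatrix} \tilde R_{22} & \tilde R_{23} \\ \tilde R_{32} & \tilde R_{33} \end{pmatrix}^{-1} \right] \begin{pmatrix} \tilde R_{21} \\ \tilde R_{31} \end{pmatrix} = 0 \tag{32}$$

Simplifying equations (30)-(32) by multiplying each by the relevant $\begin{pmatrix} \tilde R_{ii} & \tilde R_{ij} \\ \tilde R_{ji} & \tilde R_{jj} \end{pmatrix}$, we obtain:

$$\begin{pmatrix} (\gamma_1 + \gamma_{12}) I & \frac{\gamma_2}{\alpha_{22}} \tilde R_{12} \\ \frac{\gamma_1}{\alpha_{11}} \tilde R_{21} & (\gamma_2 + \gamma_{12}) I \end{pmatrix} \begin{pmatrix} \tilde R_{13} \\ \tilde R_{23} \end{pmatrix} = 0 \tag{33}$$

$$\begin{pmatrix} (\gamma_1 + \gamma_{13}) I & \frac{\gamma_3}{\alpha_{33}} \tilde R_{13} \\ \frac{\gamma_1}{\alpha_{11}} \tilde R_{31} & (\gamma_3 + \gamma_{13}) I \end{pmatrix} \begin{pmatrix} \tilde R_{12} \\ \tilde R_{32} \end{pmatrix} = 0 \tag{34}$$

$$\begin{pmatrix} (\gamma_2 + \gamma_{23}) I & \frac{\gamma_3}{\alpha_{33}} \tilde R_{23} \\ \frac{\gamma_2}{\alpha_{22}} \tilde R_{32} & (\gamma_3 + \gamma_{23}) I \end{pmatrix} \begin{pmatrix} \tilde R_{21} \\ \tilde R_{31} \end{pmatrix} = 0 \tag{35}$$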


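Now if the $2T \times T$ matrix $\begin{pmatrix} \tilde R_{ik} \\ \tilde R_{jk} \end{pmatrix}$, $i, j, k \in \{1, 2, 3\}$, were full rank, the rank of the left $2T \times 2T$ matrix
in either of the equations (33)-(35) would be T and therefore the Schur complement of its (1,1) block should be
zero, i.e.:

$$(\gamma_j + \gamma_{ij}) I - \frac{\gamma_i \gamma_j}{\alpha_{ii} \alpha_{jj} (\gamma_i + \gamma_{ij})} \tilde R_{ji} \tilde R_{ij} = 0, \tag{36}$$

in other words:

$$\tilde R_{ji} \tilde R_{ij} = \tilde R_{ij} \tilde R_{ji} = \left( \frac{(\gamma_i + \gamma_{ij})(\gamma_j + \gamma_{ij})}{\gamma_i \gamma_j} \alpha_{ii} \alpha_{jj} \right) I. \tag{37}$$

Since $\tilde R$ is symmetric, $\tilde R_{ji} = \tilde R_{ij}^t$. This implies that off-diagonal blocks of $\tilde R$ are multiples of an orthogonal matrix,
i.e.,

$$\tilde R_{ij} = \alpha_{ij} \Phi_{ij} \tag{38}$$

for some orthogonal matrix $\Phi_{ij}$ and $\alpha_{ij}$ such that

$$\alpha_{ij}^2 = \frac{(\gamma_i + \gamma_{ij})(\gamma_j + \gamma_{ij})}{\gamma_i \gamma_j} \alpha_{ii} \alpha_{jj}. \tag{39}$$

Stating (38) explicitly, we have:

$$\tilde R_{12} = \tilde R_{21}^t = \alpha_{12} \Phi_{12} \tag{40}$$

$$\tilde R_{13} = \tilde R_{31}^t = \alpha_{13} \Phi_{13} \tag{41}$$

$$\tilde R_{23} = \tilde R_{32}^t = \alpha_{23} \Phi_{23} \tag{42}$$

Replacing for $\tilde R_{ij}$ from (40)-(42) in equations (33)-(35) we obtain 6 equations, which turn out to be all the same
as the following equation when we consider the balancedness constraints of (27)-(29) and the definition of $\alpha_{ij}$ in (39):

$$\Phi_{13} = -\frac{\gamma_2}{\gamma_1 + \gamma_{12}} \cdot \frac{\alpha_{12} \alpha_{23}}{\alpha_{22} \alpha_{13}} \Phi_{12} \Phi_{23}. \tag{43}$$

Simplifying (43) using the fact that $\alpha_{ij} = \pm \sqrt{\frac{(\gamma_i + \gamma_{ij})(\gamma_j + \gamma_{ij})}{\gamma_i \gamma_j} \alpha_{ii} \alpha_{jj}}$ from (39) we obtain that:

$$\Phi_{13} = \begin{cases} +\Phi_{12} \Phi_{23} & \text{if } \alpha_{12} \alpha_{13} \alpha_{23} > 0 \\ -\Phi_{12} \Phi_{23} & \text{if } \alpha_{12} \alpha_{13} \alpha_{23} < 0 \end{cases} \tag{44}$$

Now note that in the general case $\begin{pmatrix} \tilde R_{ik} \\ \tilde R_{jk} \end{pmatrix}$ in equations (33)-(35) need not be full rank. Therefore there
is a $T \times T$ unitary matrix $\theta_{ij}$ such that:

$$\begin{pmatrix} \tilde R_{ik} \\ \tilde R_{jk} \end{pmatrix} \theta_{ij} = \begin{pmatrix} \bar R_{ik} & 0 \\ \bar R_{jk} & 0 \end{pmatrix}. \tag{45}$$

Writing explicitly:

$$\begin{pmatrix} \tilde R_{13} \\ \tilde R_{23} \end{pmatrix} \theta_{12} = \begin{pmatrix} \bar R_{13} & 0 \\ \bar R_{23} & 0 \end{pmatrix}, \quad \begin{pmatrix} \tilde R_{12} \\ \tilde R_{32} \end{pmatrix} \theta_{13} = \begin{pmatrix} \bar R_{12} & 0 \\ \bar R_{32} & 0 \end{pmatrix}, \quad \begin{pmatrix} \tilde R_{21} \\ \tilde R_{31} \end{pmatrix} \theta_{23} = \begin{pmatrix} \bar R_{21} & 0 \\ \bar R_{31} & 0 \end{pmatrix}, \tag{46}$$

where $\begin{pmatrix} \bar R_{ik} \\ \bar R_{jk} \end{pmatrix}$ is now full rank and we can assume its column rank (as well as its column size) to be $T_{ij}$
where $T_{ij} \le T$. This suggests doing a similarity transformation on $\tilde R$ with the following unitary matrix without
affecting the block principal minors:

$$\Theta = \begin{pmatrix} \theta_{23} & 0 & 0 \\ 0 & \theta_{13} & 0 \\ 0 & 0 & \theta_{12} \end{pmatrix}. \tag{47}$$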



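From which we obtain:

$$\Theta^* \tilde R \Theta = \begin{pmatrix} \alpha_{11} I & \theta_{23}^* \tilde R_{12} \theta_{13} & \theta_{23}^* \tilde R_{13} \theta_{12} \\ \theta_{13}^* \tilde R_{21} \theta_{23} & \alpha_{22} I & \theta_{13}^* \tilde R_{23} \theta_{12} \\ \theta_{12}^* \tilde R_{31} \theta_{23} & \theta_{12}^* \tilde R_{32} \theta_{13} & \alpha_{33} I \end{pmatrix}. \tag{48}$$

Considering $\tilde R_{21} \theta_{23}$ and $\theta_{13}^* \tilde R_{21}$ simultaneously and using (45) we have

$$\tilde R_{21} \theta_{23} = \begin{pmatrix} \bar R_{21} & 0 \end{pmatrix}, \tag{49}$$

$$\theta_{13}^* \tilde R_{21} = (\tilde R_{12} \theta_{13})^* = \begin{pmatrix} \bar R_{12}^* \\ 0 \end{pmatrix}. \tag{50}$$

Therefore we can simply obtain the following structure for $\theta_{13}^* \tilde R_{21} \theta_{23}$:

$$\theta_{13}^* \tilde R_{21} \theta_{23} = \begin{pmatrix} \hat R_{21} & 0 \\ 0 & 0 \end{pmatrix}, \tag{51}$$

where the dimension of $\hat R_{21}$ is $T_{13} \times T_{23}$. A similar argument for other elements yields the following structure for
$\Theta^* \tilde R \Theta$:

$$\begin{pmatrix} \alpha_{11} I_{T_{23}} & 0 & \hat R_{12} & 0 & \hat R_{13} & 0 \\ 0 & \alpha_{11} I_{T - T_{23}} & 0 & 0 & 0 & 0 \\ \hat R_{21} & 0 & \alpha_{22} I_{T_{13}} & 0 & \hat R_{23} & 0 \\ 0 & 0 & 0 & \alpha_{22} I_{T - T_{13}} & 0 & 0 \\ \hat R_{31} & 0 & \hat R_{32} & 0 & \alpha_{33} I_{T_{12}} & 0 \\ 0 & 0 & 0 & 0 & 0 & \alpha_{33} I_{T - T_{12}} \end{pmatrix}. \tag{52}$$

Now define $R' = \Theta^* \tilde R \Theta$ and similar to (24) let $S' = (R'_{[1,2]})^{-1}$, $W' = (R'_{[1,3]})^{-1}$, $U' = (R'_{[2,3]})^{-1}$ and $V' = R'^{-1}$,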



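where for $s \subseteq \{1, 2, 3\}$, $R'_{[s]}$ denotes the submatrix of $R'$ obtained from choosing the block rows and block columns
indexed by s. If we multiply (25) from the left by $\Theta^*$ and from the right by $\Theta$, then it turns out that (25) is satisfied
when $\tilde R, \tilde S, \tilde W, \tilde U, \tilde V$ are replaced by $R', S', W', U', V'$. Therefore we can write equations (33)-(35) for $R'$. Doing so
gives the following:

$$\begin{pmatrix} (\gamma_i + \gamma_{ij}) I & \frac{\gamma_j}{\alpha_{jj}} R'_{ij} \\ \frac{\gamma_i}{\alpha_{ii}} R'_{ji} & (\gamma_j + \gamma_{ij}) I \end{pmatrix} \begin{pmatrix} R'_{ik} \\ R'_{jk} \end{pmatrix} = 0, \tag{53}$$

which if we replace for values of $R'_{ij}$ from (52) and assume that $T_{ij}$ and $T_{ji}$ represent the same value, we obtain:

$$\begin{pmatrix} (\gamma_i + \gamma_{ij}) I_{T_{jk}} & 0 & \frac{\gamma_j}{\alpha_{jj}} \hat R_{ij} & 0 \\ 0 & (\gamma_i + \gamma_{ij}) I_{T - T_{jk}} & 0 & 0 \\ \frac{\gamma_i}{\alpha_{ii}} \hat R_{ji} & 0 & (\gamma_j + \gamma_{ij}) I_{T_{ik}} & 0 \\ 0 & 0 & 0 & (\gamma_j + \gamma_{ij}) I_{T - T_{ik}} \end{pmatrix} \begin{pmatrix} \hat R_{ik} & 0 \\ 0 & 0 \\ \hat R_{jk} & 0 \\ 0 & 0 \end{pmatrix} = 0. \tag{54}$$

From which it follows that:

$$\begin{pmatrix} (\gamma_i + \gamma_{ij}) I_{T_{jk}} & \frac{\gamma_j}{\alpha_{jj}} \hat R_{ij} \\ \frac{\gamma_i}{\alpha_{ii}} \hat R_{ji} & (\gamma_j + \gamma_{ij}) I_{T_{ik}} \end{pmatrix} \begin{pmatrix} \hat R_{ik} \\ \hat R_{jk} \end{pmatrix} = 0. \tag{55}$$
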


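Stating (55) explicitly, we have the following 3 equations:

$$\begin{pmatrix} (\gamma_1 + \gamma_{12}) I_{T_{23}} & \frac{\gamma_2}{\alpha_{22}} \hat R_{12} \\ \frac{\gamma_1}{\alpha_{11}} \hat R_{21} & (\gamma_2 + \gamma_{12}) I_{T_{13}} \end{pmatrix} \begin{pmatrix} \hat R_{13} \\ \hat R_{23} \end{pmatrix} = 0 \tag{56}$$

$$\begin{pmatrix} (\gamma_1 + \gamma_{13}) I_{T_{23}} & \frac{\gamma_3}{\alpha_{33}} \hat R_{13} \\ \frac{\gamma_1}{\alpha_{11}} \hat R_{31} & (\gamma_3 + \gamma_{13}) I_{T_{12}} \end{pmatrix} \begin{pmatrix} \hat R_{12} \\ \hat R_{32} \end{pmatrix} = 0 \tag{57}$$

$$\begin{pmatrix} (\gamma_2 + \gamma_{23}) I_{T_{13}} & \frac{\gamma_3}{\alpha_{33}} \hat R_{23} \\ \frac{\gamma_2}{\alpha_{22}} \hat R_{32} & (\gamma_3 + \gamma_{23}) I_{T_{12}} \end{pmatrix} \begin{pmatrix} \hat R_{21} \\ \hat R_{31} \end{pmatrix} = 0 \tag{58}$$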



Note that the dimension of $\begin{pmatrix} \hat R_{ik} \\ \hat R_{jk} \end{pmatrix}$ is $(T_{jk} + T_{ik}) \times T_{ij}$. Therefore the nullity of the left matrix in (55) is at
least $T_{ij}$. Hence if we let the rank of the left matrix in (55) be r we will have:

$$r \le T_{jk} + T_{ik} - T_{ij}. \tag{59}$$

On the other hand it is also obvious that:

$$r \ge T_{jk}, T_{ik}. \tag{60}$$

From (59) and (60) it follows that:

$$T_{ij} \le \min(T_{jk}, T_{ik}). \tag{61}$$

Since a similar argument can be used for $T_{jk}$ and $T_{ik}$ we conclude that:

$$T_{12} = T_{23} = T_{13} \triangleq \hat T. \tag{62}$$



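Now note that (56)-(58) is similar to (33)-(35) with $\hat R_{ij}$ instead of $\tilde R_{ij}$. Therefore the same argument that led to
(38)-(44) yields that $\hat R_{ij}$ is a multiple of an orthogonal matrix, say $\Phi_{ij}$; in other words

$$\hat R_{12} = \hat R_{21}^t = \alpha_{12} \Phi_{12} \tag{63}$$

$$\hat R_{13} = \hat R_{31}^t = \alpha_{13} \Phi_{13} \tag{64}$$

$$\hat R_{23} = \hat R_{32}^t = \alpha_{23} \Phi_{23} \tag{65}$$

where similar to (39), $\alpha_{ij}$ is given by

$$\alpha_{ij}^2 = \frac{(\gamma_i + \gamma_{ij})(\gamma_j + \gamma_{ij})}{\gamma_i \gamma_j} \alpha_{ii} \alpha_{jj} \tag{66}$$

and we have,

$$\Phi_{13} = \begin{cases} +\Phi_{12} \Phi_{23} & \text{if } \alpha_{12} \alpha_{13} \alpha_{23} > 0 \\ -\Phi_{12} \Phi_{23} & \text{if } \alpha_{12} \alpha_{13} \alpha_{23} < 0 \end{cases} \tag{67}$$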
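Note that the sign of $\alpha_{ij}$ can be chosen arbitrarily in (66). Finally it follows that after a series of permutations on
$R'$ and substituting for values of $\hat R_{ij}$ from (63)-(65), $R' = \Theta^* \tilde R \Theta$ can be written as follows:

$$\begin{pmatrix} \alpha_{11} I_{\hat T} & \alpha_{12} \Phi_{12} & \alpha_{13} \Phi_{13} & 0 & 0 & 0 \\ \alpha_{12} \Phi_{12}^t & \alpha_{22} I_{\hat T} & \alpha_{23} \Phi_{23} & 0 & 0 & 0 \\ \alpha_{13} \Phi_{13}^t & \alpha_{23} \Phi_{23}^t & \alpha_{33} I_{\hat T} & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha_{11} I_{T - \hat T} & 0 & 0 \\ 0 & 0 & 0 & 0 & \alpha_{22} I_{T - \hat T} & 0 \\ 0 & 0 & 0 & 0 & 0 & \alpha_{33} I_{T - \hat T} \end{pmatrix}, \tag{68}$$

which, if viewed as the timeshare of a set of Gaussian random variables with an orthogonal covariance matrix and
another set of independent random variables, has the same block principal minors as (52). Moreover $R'$ has the
same principal minors as $\tilde R$ and R and therefore (68) is the optimizing solution to problem (17). However note
that (68) is an optimal solution only if it is a positive semi-definite matrix. Therefore the $\alpha_{ij}$'s and $\Phi_{ij}$'s should be
such that$^3$

$$\det \begin{pmatrix} \alpha_{11} I_{\hat T} & \alpha_{12} \Phi_{12} & \alpha_{13} \Phi_{13} \\ \alpha_{12} \Phi_{12}^t & \alpha_{22} I_{\hat T} & \alpha_{23} \Phi_{23} \\ \alpha_{13} \Phi_{13}^t & \alpha_{23} \Phi_{23}^t & \alpha_{33} I_{\hat T} \end{pmatrix} \ge 0$$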



(69) 



(70) 



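Lemma 3 (Block Orthogonal, Block Diagonal Covariance): Consider the covariance matrix

$$R = \begin{pmatrix}
\alpha_{11} I_{\hat T} & \alpha_{12}\Phi_{12} & \alpha_{13}\Phi_{13} \\
\alpha_{12}\Phi_{12}^t & \alpha_{22} I_{\hat T} & \alpha_{23}\Phi_{23} \\
\alpha_{13}\Phi_{13}^t & \alpha_{23}\Phi_{23}^t & \alpha_{33} I_{\hat T}
\end{pmatrix} \tag{71}$$

where the $\Phi_{ij}$ are orthogonal, $\alpha_{ii} > 0$, and $\alpha_{ii}\alpha_{jj} > \alpha_{ij}^2$ in the $2\times2$ block principal minors $m_{ij} = (\alpha_{ii}\alpha_{jj} - \alpha_{ij}^2)^{\hat T}$. Then

$$\left(\alpha_{11}\alpha_{22}\alpha_{33} - \alpha_{11}\alpha_{23}^2 - \alpha_{22}\alpha_{13}^2 - \alpha_{33}\alpha_{12}^2 - 2|\alpha_{12}\alpha_{13}\alpha_{23}|\right)^{\hat T} \le \det R \le \left(\alpha_{11}\alpha_{22}\alpha_{33} - \alpha_{11}\alpha_{23}^2 - \alpha_{22}\alpha_{13}^2 - \alpha_{33}\alpha_{12}^2 + 2|\alpha_{12}\alpha_{13}\alpha_{23}|\right)^{\hat T} \tag{72}$$

where $\Phi = \Phi_{13}^t\Phi_{12}\Phi_{23}$, and the upper bound is tight when $\Phi + \Phi^t = 2I$ and the lower bound is achieved when $\Phi + \Phi^t = -2I$.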



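$^3$ Note that $\alpha_{ij}$ and the set of $\gamma_i, \gamma_{ij}$ are dependent through (66). However, it can be shown that for a given set of $\alpha_{ii}, \alpha_{ij}$ the values of $\gamma_i, \gamma_{ij}$ can be determined such that (66) and (27)-(29) will hold.

Proof: We can easily write the following,

$$\det R = \frac{1}{\alpha_{11}^{\hat T}} \det \begin{pmatrix}
(\alpha_{11}\alpha_{22} - \alpha_{12}^2) I_{\hat T} & \alpha_{11}\alpha_{23}\Phi_{23} - \alpha_{12}\alpha_{13}\Phi_{12}^t\Phi_{13} \\
\alpha_{11}\alpha_{23}\Phi_{23}^t - \alpha_{12}\alpha_{13}\Phi_{13}^t\Phi_{12} & (\alpha_{11}\alpha_{33} - \alpha_{13}^2) I_{\hat T}
\end{pmatrix}$$

$$= \frac{1}{\alpha_{11}^{\hat T}} \det\left((\alpha_{11}\alpha_{22} - \alpha_{12}^2)(\alpha_{11}\alpha_{33} - \alpha_{13}^2) I_{\hat T} - (\alpha_{11}\alpha_{23}\Phi_{23}^t - \alpha_{12}\alpha_{13}\Phi_{13}^t\Phi_{12})(\alpha_{11}\alpha_{23}\Phi_{23} - \alpha_{12}\alpha_{13}\Phi_{12}^t\Phi_{13})\right)$$

$$= \det\left((\alpha_{11}\alpha_{22}\alpha_{33} - \alpha_{11}\alpha_{23}^2 - \alpha_{22}\alpha_{13}^2 - \alpha_{33}\alpha_{12}^2) I_{\hat T} + \alpha_{12}\alpha_{13}\alpha_{23}(\Phi_{13}^t\Phi_{12}\Phi_{23} + \Phi_{23}^t\Phi_{12}^t\Phi_{13})\right) \tag{73}$$

The result immediately follows from $-2I \le \Phi + \Phi^t \le 2I$. ■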
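The bounds of Lemma 3 can be checked numerically. The following is only a sketch under stated assumptions: it takes $\hat T = 2$, uses $2\times2$ rotation matrices as the orthogonal blocks $\Phi_{ij}$, and picks illustrative values for the $\alpha$'s; the helper names (`det`, `rot`, `assemble`, `lemma3_terms`) are hypothetical and not from the paper.

```python
import math

def det(mat):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in mat]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) < 1e-14:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d

def rot(t):
    """A 2x2 rotation matrix, used here as an orthogonal block."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def transpose(m):
    return [list(r) for r in zip(*m)]

def scale(c, m):
    return [[c * v for v in row] for row in m]

def assemble(blocks):
    """Build a dense matrix from a square grid of equally sized blocks."""
    return [sum((b[i] for b in brow), []) for brow in blocks for i in range(len(brow[0]))]

I2 = [[1.0, 0.0], [0.0, 1.0]]

def lemma3_terms(a11, a22, a33, a12, a13, a23, t12, t13, t23):
    """Return (lower bound, det R, upper bound) of (72) for T_hat = 2."""
    p12, p13, p23 = rot(t12), rot(t13), rot(t23)
    R = assemble([
        [scale(a11, I2), scale(a12, p12), scale(a13, p13)],
        [scale(a12, transpose(p12)), scale(a22, I2), scale(a23, p23)],
        [scale(a13, transpose(p13)), scale(a23, transpose(p23)), scale(a33, I2)],
    ])
    base = a11 * a22 * a33 - a11 * a23**2 - a22 * a13**2 - a33 * a12**2
    cross = 2.0 * abs(a12 * a13 * a23)
    return (base - cross) ** 2, det(R), (base + cross) ** 2

# Generic angles: det R lies strictly between the two bounds.
lo, d, hi = lemma3_terms(2.0, 2.0, 2.0, 0.5, 0.4, 0.3, 0.7, 0.2, 1.1)
print(lo <= d <= hi)
```

With rotation blocks, $\Phi = \Phi_{13}^t\Phi_{12}\Phi_{23}$ is a rotation by `t12 + t23 - t13`, so choosing `t13 = t12 + t23` makes $\Phi = I$ and the determinant meets the upper bound when $\alpha_{12}\alpha_{13}\alpha_{23} > 0$.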

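The $1\times1$ and $2\times2$ minors of the covariance matrix structure (68) obtained in Lemma 2 can now be written as,

$$m_i = \alpha_{ii}^{T} \tag{74}$$
$$m_{ij} = (\alpha_{ii}\alpha_{jj} - \alpha_{ij}^2)^{\hat T}(\alpha_{ii}\alpha_{jj})^{T-\hat T} \tag{75}$$

Moreover, since based on (67), for matrix (68), $\Phi = \Phi_{13}^t\Phi_{12}\Phi_{23} = \mathrm{sign}(\alpha_{12}\alpha_{13}\alpha_{23}) I_{\hat T}$, based on Lemma 3, $m_{123}$ of (68) is given by:

$$m_{123} = (\alpha_{11}\alpha_{22}\alpha_{33})^{T-\hat T}\left(\alpha_{11}\alpha_{22}\alpha_{33} - \alpha_{11}\alpha_{23}^2 - \alpha_{22}\alpha_{13}^2 - \alpha_{33}\alpha_{12}^2 \pm 2|\alpha_{12}\alpha_{13}\alpha_{23}|\right)^{\hat T} \tag{76}$$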



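However, these values can also be obtained by a timeshare of 3 scalar random variables with covariance matrix,

$$\begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{12} & \alpha_{22} & \alpha_{23} \\ \alpha_{13} & \alpha_{23} & \alpha_{33} \end{pmatrix} \tag{77}$$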

and 3 other independent scalar random variables. This suggests that the region of 3 vector-valued Gaussian random 
variables may be obtained from the convex hull region of 3 scalar Gaussian random variables. In other words for 
n = 3, considering vector-valued random variables will not give any entropy vector that is not obtainable from 
scalar-valued ones. This is essentially the statement of Theorem 5 and we can now proceed to a more formal proof.

Proof of Theorem 5: As in Lemma 2 we write the following optimization problem,

$$\max_{R} \sum_{S \subseteq \{1,2,3\}} \gamma_S \log\det R_S \tag{78}$$

As was obtained in Lemma 2, the optimal solution is of the following form:



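$$R = \begin{pmatrix}
\alpha_{11} I_{\hat T} & 0 & \alpha_{12}\Phi_{12} & 0 & \alpha_{13}\Phi_{13} & 0 \\
0 & \alpha_{11} I_{T-\hat T} & 0 & 0 & 0 & 0 \\
\alpha_{12}\Phi_{12}^t & 0 & \alpha_{22} I_{\hat T} & 0 & \alpha_{23}\Phi_{23} & 0 \\
0 & 0 & 0 & \alpha_{22} I_{T-\hat T} & 0 & 0 \\
\alpha_{13}\Phi_{13}^t & 0 & \alpha_{23}\Phi_{23}^t & 0 & \alpha_{33} I_{\hat T} & 0 \\
0 & 0 & 0 & 0 & 0 & \alpha_{33} I_{T-\hat T}
\end{pmatrix} \tag{79}$$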



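where $\Phi_{ij}$ are orthogonal matrices and $\Phi_{13} = \mathrm{sign}(\alpha_{12}\alpha_{13}\alpha_{23})\Phi_{12}\Phi_{23}$. Now let

$$P = \mathrm{diag}\left(I_{\hat T},\ I_{T-\hat T},\ \Phi_{12}^t,\ I_{T-\hat T},\ \Phi_{13}^t,\ I_{T-\hat T}\right) \tag{80}$$

and define

$$R^P = P^t R P = \begin{pmatrix}
\alpha_{11} I_{\hat T} & 0 & \alpha_{12} I_{\hat T} & 0 & \alpha_{13} I_{\hat T} & 0 \\
0 & \alpha_{11} I_{T-\hat T} & 0 & 0 & 0 & 0 \\
\alpha_{12} I_{\hat T} & 0 & \alpha_{22} I_{\hat T} & 0 & \alpha_{23}\Phi_{12}\Phi_{23}\Phi_{13}^t & 0 \\
0 & 0 & 0 & \alpha_{22} I_{T-\hat T} & 0 & 0 \\
\alpha_{13} I_{\hat T} & 0 & \alpha_{23}\Phi_{13}\Phi_{23}^t\Phi_{12}^t & 0 & \alpha_{33} I_{\hat T} & 0 \\
0 & 0 & 0 & 0 & 0 & \alpha_{33} I_{T-\hat T}
\end{pmatrix} \tag{81}$$

However, since $\Phi_{12}\Phi_{23}\Phi_{13}^t = \pm I_{\hat T}$, it follows that all blocks of $R^P$ are diagonal and therefore $R^P$ can be viewed as a timeshare of scalar random variables. Moreover, since $R^P$ has the same principal minors as $R$, $R^P$ is also an optimal solution of (78). ■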

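In order to proceed to the proof of Theorem 6 we need the following lemma.
Lemma 4: Consider the function

$$f(\delta) = \left(-2 + \sum_{l=1}^{3} x_l^{\delta} + 2\prod_{l=1}^{3}\sqrt{1 - x_l^{\delta}}\right)^{1/\delta} \tag{82}$$

where $0 \le x_l \le 1$, for $l = 1,2,3$. $f$ is either a constant function equal to $\min_{i\ne j} x_ix_j$ or has a unique global maximum given by:

$$\max_{\delta} f(\delta) = \frac{\prod_l x_l}{\max_l (x_l)} = \min_{i,j\in\{1,2,3\},\, i\ne j} x_i x_j \tag{83}$$

Moreover, if we let $y = \frac{\prod_l x_l}{\max_l x_l} + 2\max_l x_l - \sum_l x_l$, then

$$\text{If } y \le 0: \quad \max_{\delta \ge 1} f(\delta) = \frac{\prod_l x_l}{\max_l (x_l)} \tag{84}$$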



Proof: See Appendix. ■ 
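As an illustration of Lemma 4 (not a substitute for the proof in the Appendix), the following sketch evaluates $f(\delta)$ from (82) on a grid and compares the largest value found with $\min_{i\ne j} x_ix_j$. The sample point `x`, the grid range and the helper names are assumptions chosen for illustration; grid points where the expression inside the parenthesis of (82) is not positive are skipped.

```python
import math

def f(delta, x):
    """f(delta) from (82); returns None where the inner expression is not positive."""
    xd = [v ** delta for v in x]
    inner = -2.0 + sum(xd) + 2.0 * math.sqrt(math.prod(1.0 - v for v in xd))
    # A small threshold guards against rounding noise near zero.
    return inner ** (1.0 / delta) if inner > 1e-12 else None

def grid_max(x, lo=0.05, hi=20.0, steps=4000):
    """Largest value of f on a uniform grid of delta values, with its argument."""
    best_val, best_delta = -1.0, None
    for s in range(steps + 1):
        d = lo + (hi - lo) * s / steps
        val = f(d, x)
        if val is not None and val > best_val:
            best_val, best_delta = val, d
    return best_val, best_delta

x = (0.5, 0.6, 0.7)                      # illustrative values in [0, 1]
best_val, best_delta = grid_max(x)
target = math.prod(x) / max(x)           # equals min over i != j of x_i * x_j
y = target + 2 * max(x) - sum(x)         # the quantity y of Lemma 4, as read here
print(best_val, best_delta, target, y)
```

For this sample point $y \le 0$, and the grid maximizer lies at some $\delta \ge 1$, in line with (84).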
Corollary 1: Let $\delta = \frac{1}{\theta}$ in Lemma 4. Then $f(\frac{1}{\theta})$ is either a constant function equal to $\min_{i\ne j} x_ix_j$ or has a unique global maximizer such that $\max_{\theta} f(\frac{1}{\theta}) = \min_{i\ne j} x_ix_j$. Furthermore,

$$\text{If } y \le 0 \implies \max_{0<\theta\le1} f\left(\tfrac{1}{\theta}\right) = \frac{\prod_l x_l}{\max_l (x_l)} \tag{85}$$

Now we can proceed to the proof of Theorem 6.

Proof of Theorem 6: We show that the entropy region of 3 continuous random variables can be generated from the convex cone of the Gaussian entropy region. To this end we prove that any entropy vector of 3 continuous random variables lies in the convex cone of Gaussian entropies. Let $g$ be an arbitrary entropy vector corresponding to 3 continuous random variables. We know that the only inequalities that constrain the entries of $g$ are

$$g_{ij} \le g_i + g_j, \qquad g_{123} + g_k \le g_{ik} + g_{jk} \tag{86}$$

Let $p = e^{g}$ where the exponential acts componentwise. Then the equivalent set of constraints to (86) are

$$p_i > 0, \qquad 0 \le p_{ij} \le p_ip_j, \qquad 0 \le p_{123} \le \frac{p_{ik}p_{jk}}{p_k} \tag{87}$$



We now show that any such $p$-vector can be obtained from Gaussian random variables. Consider the structure obtained in Lemma 2 which suggested the time-share of a set of independent random variables with covariance matrix of block size $T - \hat T$ and another set of random variables with orthogonal covariance matrix of block size $\hat T$. We try to find $\alpha_{ii}, \alpha_{ij}$ in structure (68) that will yield the desired $p_i$ and $p_{ij}$. Therefore, if $m$ is the vector of block principal minors of structure (68), we need to obtain $m^{1/T} = p$. Using (74) and (75) we can solve for $\alpha_{ii}$ and $\alpha_{ij}$ and obtain

$$\alpha_{ii} = p_i > 0, \qquad \alpha_{ij} = \pm\sqrt{p_ip_j\left(1 - \left(\frac{p_{ij}}{p_ip_j}\right)^{T/\hat T}\right)} \tag{88}$$

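Now we need to show that $p_{123}$ falls within the set of achievable values of $m_{123}^{1/T}$. Note that calculating the determinant of the matrix in (68) via Lemma 3 or equation (76) gives

$$\max \det R = (\alpha_{11}\alpha_{22}\alpha_{33})^{T-\hat T}\left(\alpha_{11}\alpha_{22}\alpha_{33} - \alpha_{11}\alpha_{23}^2 - \alpha_{22}\alpha_{13}^2 - \alpha_{33}\alpha_{12}^2 + 2|\alpha_{12}\alpha_{13}\alpha_{23}|\right)^{\hat T} \tag{89}$$

where the max is achieved when $\alpha_{12}\alpha_{13}\alpha_{23} > 0$. Letting $\theta = \frac{\hat T}{T}$ and replacing for values of $\alpha_{ii}$ and $\alpha_{ij}$ in terms of $p_i$ and $p_{ij}$ from (88) yields

$$\max m_{123}^{1/T} = p_1p_2p_3\left(-2 + \left(\frac{p_{12}}{p_1p_2}\right)^{1/\theta} + \left(\frac{p_{13}}{p_1p_3}\right)^{1/\theta} + \left(\frac{p_{23}}{p_2p_3}\right)^{1/\theta} + 2\prod_{i<j}\sqrt{1 - \left(\frac{p_{ij}}{p_ip_j}\right)^{1/\theta}}\right)^{\theta} \tag{90}$$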



Of course this corresponds to the determinant of a covariance matrix of the time-share of some Gaussian random variables only if the term inside the outer parenthesis in (90) is positive. Therefore, assuming $x_1 = \frac{p_{23}}{p_2p_3}$, $x_2 = \frac{p_{13}}{p_1p_3}$, $x_3 = \frac{p_{12}}{p_1p_2}$ and using (82) in Lemma 4,

$$\max m_{123}^{1/T} = p_1p_2p_3\, f\left(\tfrac{1}{\theta}\right) \tag{91}$$



It remains to show that for any given $p_{123}$ that satisfies the latter condition of (87), we have $p_{123} \le p_1p_2p_3 \sup_{\theta} f(\frac{1}{\theta})$. Since by (87), $p_{123}$ can be as large as $\min_k \frac{p_{ik}p_{jk}}{p_k}$, we really need to show that $p_1p_2p_3 \sup_\theta f(\frac{1}{\theta})$ achieves $\min_k \frac{p_{ik}p_{jk}}{p_k}$ for some value of $\theta$. Therefore we need to compute,

$$\sup m_{123}^{1/T} = p_1p_2p_3 \sup_{0<\theta\le1} f\left(\tfrac{1}{\theta}\right) \tag{92}$$

Note that since we have fixed $p_i$ and $p_{ij}$, and since $\theta$ represents the timesharing of 2 sets of random variables, $\theta = 0$ is not generally allowed (otherwise we force the random variables to be independent, which is not necessarily the case for given $p_i$ and $p_{ij}$). Therefore we have used sup instead of max in (92). To find $\sup f(\frac{1}{\theta})$ with respect to $\theta$ over $0 < \theta \le 1$, note that as stated in Lemma 4, $f(\frac{1}{\theta})$ has a (unique) global maximum with a value of $\min_{i\ne j} x_ix_j = \min_k \frac{p_{ik}p_{jk}}{p_ip_jp_k^2}$ at some $\theta = \theta_0$. If for the assumed values of $x_1, x_2, x_3$ (obtained from the fixed values of $p_i$ and $p_{ij}$), $f(\frac{1}{\theta})$ achieves its maximum for $0 < \theta \le 1$, i.e., $0 < \theta_0 \le 1$, then

$$\sup_{0<\theta\le1} m_{123}^{1/T} = p_1p_2p_3 \min_k \frac{p_{jk}p_{ik}}{p_ip_jp_k^2} = \min_k \frac{p_{jk}p_{ik}}{p_k} \tag{93}$$

This immediately gives that the vector $p = e^{g}$ is achievable. Otherwise, if $\theta_0 > 1$, then for some $\theta' \ge \theta_0$, define the vector $p' = p^{1/\theta'}$ (elementwise exponentiation). This means that $p'_i = p_i^{1/\theta'}$, $p'_{ij} = p_{ij}^{1/\theta'}$ and hence $x'_l = x_l^{1/\theta'}$. Now we try to achieve the vector $p'$ by the Gaussian structure of (68). For this purpose we follow similar steps as above for $p'$ and let $m'$ be the vector of block principal minors of the new corresponding matrix. Then we have

$$\sup_{0<\theta\le1} {m'}_{123}^{1/T} = p'_1p'_2p'_3 \max_{0<\theta\le1}\left(f\left(\tfrac{1}{\theta\theta'}\right)\right)^{1/\theta'} \tag{94}$$

The global maximum of $f$ will now happen for $\theta = \frac{\theta_0}{\theta'} \le 1$, at which it will have the value $\min_k \frac{p_{ik}p_{jk}}{p_ip_jp_k^2}$. Replacing this in (94) gives $\sup_{0<\theta\le1} {m'}_{123}^{1/T} = \min_k \frac{p'_{ik}p'_{jk}}{p'_k}$, which means that although $p$ was not achievable with Gaussians, $p' = p^{1/\theta'}$ is achievable. The result of the theorem is then established by noting that $p'$ corresponds to a valid entropy vector $g' = \frac{1}{\theta'}g$, i.e., a scaled version of $g$. Note that if the maximum of $f$ happens at infinity, i.e., $\theta_0 \to \infty$, then we should consider a sequence of scaled vectors $p'_i = p^{1/\theta'_i}$ (or equivalently $g'_i = \frac{1}{\theta'_i}g$) where $\theta'_i$ is an unbounded increasing sequence in $i$. As $i \to \infty$, $g'_i$ will asymptotically fall in the Gaussian region (a small perturbation of $g'_i$, $i \to \infty$, will put $g'_i$ in the Gaussian region). Hence $g$ will belong to the closure of the convex cone of the Gaussian region as well. ■
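The role of timesharing in the proof can be illustrated numerically. The sketch below is an illustration under assumptions, not the paper's construction in full: the target values `p` and `p2` are hypothetical numbers satisfying (87), the scalar case $T = \hat T = 1$ of (88) is used to build a $3\times3$ covariance matrix, and the upper limit $\min_k p_{ik}p_{jk}/p_k$ of (87) is then approached by scanning $\theta$ over a grid in $p_1p_2p_3 f(1/\theta)$.

```python
import math

def f(delta, x):
    """f(delta) from (82); None where the inner expression is not positive."""
    xd = [v ** delta for v in x]
    inner = -2.0 + sum(xd) + 2.0 * math.sqrt(math.prod(1.0 - v for v in xd))
    return inner ** (1.0 / delta) if inner > 1e-12 else None

def det3(m):
    (a, b, c), (d, e, g), (h, i, j) = m
    return a * (e * j - g * i) - b * (d * j - g * h) + c * (d * i - e * h)

# Hypothetical targets p_i and p_ij (they satisfy 0 <= p_ij <= p_i p_j).
p = {1: 2.0, 2: 3.0, 3: 4.0}
p2 = {(1, 2): 3.0, (1, 3): 4.8, (2, 3): 8.4}

# Scalar case of (88): alpha_ii = p_i and alpha_ij^2 = p_i p_j - p_ij.
alpha = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    alpha[i][i] = p[i + 1]
for (i, j), v in p2.items():
    alpha[i - 1][j - 1] = alpha[j - 1][i - 1] = math.sqrt(p[i] * p[j] - v)

x = (p2[(2, 3)] / (p[2] * p[3]), p2[(1, 3)] / (p[1] * p[3]), p2[(1, 2)] / (p[1] * p[2]))
scale = p[1] * p[2] * p[3]
scalar_det = det3(alpha)                       # value reached without timesharing
cap = min(p2[(1, 2)] * p2[(1, 3)] / p[1],      # upper limit on p_123 from (87)
          p2[(1, 2)] * p2[(2, 3)] / p[2],
          p2[(1, 3)] * p2[(2, 3)] / p[3])

# Scan theta in [0.05, 1]; smaller theta is skipped to avoid numerical noise.
values = [(scale * f(1000.0 / k, x), k / 1000.0) for k in range(50, 1001)
          if f(1000.0 / k, x) is not None]
best, best_theta = max(values)
print(scalar_det, best, best_theta, cap)
```

In this example the scalar construction alone falls short of the limit, while a timeshare with a suitable $\theta < 1$ reaches it up to grid resolution.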

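Conjecture 2: In Lemma 4, if for some $\bar\delta > 0$ we have $y(\bar\delta) > 0$, then $\max_{\delta \ge \bar\delta} f(\delta) = f(\bar\delta)$. In particular

$$\text{If } y = y(1) > 0 \implies \max_{\delta \ge 1} f(\delta) = f(1) \tag{95}$$

Moreover, if we let $\delta = \frac{1}{\theta}$, then (95) translates to

$$\text{If } y = y(1) > 0 \implies \max_{0<\theta\le1} f\left(\tfrac{1}{\theta}\right) = f(1) \tag{96}$$

Simulations of the function $f$ for different values of $\delta$ (Fig. 2 in the Appendix) support the statement of Conjecture 2. Our Conjecture 1 relies on the above.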

Proof of Conjecture 1 Assuming Conjecture 2: To find the Gaussian entropy region we again employ Lemma 2 to obtain the boundary entropies of the region. Hence we consider the structure of (68), which is obtained from the time-share of a set of independent random variables with covariance matrix of block size $T - \hat T$ and another set of random variables with orthogonal covariance matrix of block size $\hat T$. Let $m$ be the vector of block principal minors of the matrix in (68) and let $q = m^{1/T}$ where the exponential acts componentwise. Moreover, denote the corresponding entropy vector by $g = \log q$ (log acting componentwise). Then we would like to characterize the set of $q$-vectors (equivalently $g$-vectors) that can arise from (68) (i.e., they lie in the convex hull of Gaussian structures). First let us investigate the constraints on $q_i$ and $q_{ij}$. It is easy to see that $\alpha_{ii} = q_i > 0$ and $\alpha_{ij} = \pm\sqrt{q_iq_j\left(1 - \left(\frac{q_{ij}}{q_iq_j}\right)^{T/\hat T}\right)}$. Therefore the imposed constraints are

$$q_i > 0, \qquad 0 \le q_{ij} \le q_iq_j \tag{97}$$



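where $q_{ij} \ge 0$ is due to the positivity requirement of the matrix. Next we would like to obtain the limits of $q_{123}$. For this purpose assume that $q_i$ and $q_{ij}$ are given fixed numbers satisfying (97). The determinant of matrix (68) is obtained from (76). If we let $\theta = \frac{\hat T}{T}$ and substitute for $\alpha_{ii}$ and $\alpha_{ij}$ in (76) we obtain:

$$\max q_{123} = q_1q_2q_3\left(-2 + \left(\frac{q_{12}}{q_1q_2}\right)^{1/\theta} + \left(\frac{q_{13}}{q_1q_3}\right)^{1/\theta} + \left(\frac{q_{23}}{q_2q_3}\right)^{1/\theta} + 2\prod_{i<j}\sqrt{1 - \left(\frac{q_{ij}}{q_iq_j}\right)^{1/\theta}}\right)^{\theta} \tag{98}$$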



Note that, as described in the proof of Theorem 6, we should insist on the positivity of the expression inside the outermost parenthesis in (98). Therefore, defining $x_1 = \frac{q_{23}}{q_2q_3}$, $x_2 = \frac{q_{13}}{q_1q_3}$, $x_3 = \frac{q_{12}}{q_1q_2}$ and using (82) in Lemma 4,

$$\sup_{0<\theta\le1} q_{123} = q_1q_2q_3 \max_{0<\theta\le1} f\left(\tfrac{1}{\theta}\right) \tag{99}$$



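where, since $q_i$ and $q_{ij}$ are fixed, we have excluded $\theta = 0$ and used sup instead of max for $q_{123}$ in (99) so that we do not enforce independence of the underlying random variables.$^4$ Using Corollary 1 and assuming that Conjecture 2 holds we obtain,

$$\max_{0<\theta\le1} f\left(\tfrac{1}{\theta}\right) = \begin{cases} \dfrac{\prod_l x_l}{\max_l (x_l)} & \text{if } y \le 0 \\ f(1) & \text{if } y > 0 \end{cases} \tag{100}$$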



where $y = \frac{\prod_l x_l}{\max_l x_l} + 2\max_l x_l - \sum_l x_l$. Replacing for $x_l$ in (100) in terms of $q$-vector entries and using the result of (100) in (99), with the final substitution of $q$-entries in terms of entropy elements of $g = \log q$, yields the conjectured result. Note that when $y \le 0$ the characterization of the region is known perfectly from Corollary 1. The only part of the entropy region that is conjectured about is when $y > 0$, whose proof relies on the validity of Conjecture 2.



IV. Cayley's Hyperdeterminant 

Recall from (4) that the entropy of a collection of Gaussian random variables is simply the "log-determinant" of
their covariance matrix. Similarly, the entropy of any subset of Gaussian random variables is simply the "log" of 
the principal minor of the covariance matrix corresponding to this subset. Therefore one approach to characterizing 
the entropy region of Gaussians, is to study the determinantal relations of a symmetric positive semi-definite matrix. 

$^4$ Note that if the underlying random variables are independent, then $f(\frac{1}{\theta})$ will be independent of $\theta$.

For example, consider 3 Gaussian random variables. While the entropy vector of 3 random variables is a 7 dimensional object, there are only 6 free parameters in a symmetric positive semi-definite matrix and therefore the minors should satisfy a constraint. It has very recently been shown that this constraint is given by Cayley's so-called $2\times2\times2$ "hyperdeterminant" [20]. The hyperdeterminant is a generalization of the determinant concept for matrices to tensors and it was first introduced by Cayley in 1845 [32].

There are a couple of equivalent definitions for the hyperdeterminant, among which we choose the definition through the degeneracy of a multilinear form. Consider the following multilinear form of the format $(k_1+1)\times(k_2+1)\times\ldots\times(k_n+1)$ in variables $X_1,\ldots,X_n$ where each variable $X_j$ is a vector of length $(k_j+1)$ with elements in $\mathbb{C}$:

$$f(X_1,X_2,\ldots,X_n) = \sum_{i_1=0}^{k_1}\sum_{i_2=0}^{k_2}\cdots\sum_{i_n=0}^{k_n} a_{i_1,i_2,\ldots,i_n}\, x_{1,i_1}x_{2,i_2}\cdots x_{n,i_n} \tag{101}$$

The multilinear form $f$ is said to be degenerate if and only if there is a non-trivial solution $(X_1,X_2,\ldots,X_n)$ to the following system of partial derivative equations [33]:

$$\frac{\partial f}{\partial x_{j,i}} = 0 \quad \text{for all } j = 1,\ldots,n \text{ and } i = 0,\ldots,k_j \tag{102}$$

The unique (up to a scale) irreducible polynomial function of the entries $a_{i_1,i_2,\ldots,i_n}$ with integral coefficients that vanishes when $f$ is degenerate is called the hyperdeterminant.

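As a simple example, let $X = \begin{pmatrix} x_0 \\ x_1 \end{pmatrix}$, $Y = \begin{pmatrix} y_0 \\ y_1 \end{pmatrix}$, and $A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}$. Consider the multilinear form $f(X,Y) = \sum_{i,j=0}^{1} a_{i,j}x_iy_j = X^tAY$. The multilinear form $f$ is degenerate if there is a non-trivial solution for $X, Y$ such that

$$\frac{\partial f}{\partial X} = AY = 0 \tag{103}$$
$$\frac{\partial f}{\partial Y} = A^tX = 0 \tag{104}$$

A non-trivial solution exists if and only if $\det A = 0$. Therefore the hyperdeterminant is simply the determinant in this case.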
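The $2\times2$ case can be made concrete with a few lines of code. The sketch below is illustrative only (the function names are hypothetical): for a singular integer matrix it returns non-zero vectors $X, Y$ satisfying (103) and (104), and it returns nothing when $\det A \ne 0$.

```python
def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def apply(A, v):
    """Matrix-vector product A v for a 2x2 matrix."""
    return tuple(A[r][0] * v[0] + A[r][1] * v[1] for r in range(2))

def apply_t(A, v):
    """Matrix-vector product A^t v for a 2x2 matrix."""
    return tuple(A[0][c] * v[0] + A[1][c] * v[1] for c in range(2))

def degenerate_point(A):
    """Return non-zero (X, Y) with A Y = 0 and A^t X = 0, or None if det A != 0."""
    if det2(A) != 0:
        return None
    (a00, a01), (a10, a11) = A
    # Each candidate is annihilated by A (resp. A^t) exactly because det A = 0.
    Y = next((v for v in [(a01, -a00), (a11, -a10)] if any(v)), (1, 0))
    X = next((v for v in [(a10, -a00), (a11, -a01)] if any(v)), (1, 0))
    return X, Y

A = [[1, 2], [2, 4]]          # singular, so the bilinear form is degenerate
X, Y = degenerate_point(A)
print(X, Y, apply(A, Y), apply_t(A, X))
```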

The hyperdeterminant of a $2\times2\times2$ multilinear form $f = \sum_{i_1=0}^{1}\sum_{i_2=0}^{1}\sum_{i_3=0}^{1} a_{i_1i_2i_3}\, x_{1,i_1}x_{2,i_2}x_{3,i_3}$ was first computed by Cayley [32] and is as follows:

$$\begin{aligned}
&-a_{000}^2a_{111}^2 - a_{100}^2a_{011}^2 - a_{010}^2a_{101}^2 - a_{001}^2a_{110}^2 \\
&- 4a_{000}a_{110}a_{101}a_{011} - 4a_{100}a_{010}a_{001}a_{111} \\
&+ 2a_{000}a_{100}a_{011}a_{111} + 2a_{000}a_{010}a_{101}a_{111} \\
&+ 2a_{000}a_{001}a_{110}a_{111} + 2a_{100}a_{010}a_{101}a_{011} \\
&+ 2a_{100}a_{001}a_{110}a_{011} + 2a_{010}a_{001}a_{110}a_{101} = 0
\end{aligned} \tag{105}$$

In [20] it is further shown that the principal minors of an $n\times n$ symmetric matrix satisfy the $2\times2\times\ldots\times2$
(n times) hyperdeterminant. It is thus clear that determining the entropy region of Gaussian random variables is 
intimately related to Cayley's hyperdeterminant. 

It is with this viewpoint in mind that we study the hyperdeterminant in this section. In the next 2 subsections, 
first we present a new determinant formula for the $2\times2\times2$ hyperdeterminant which may be of interest since
computing the hyperdeterminant of higher formats is extremely difficult and our formula may suggest a way of
attacking more complicated hyperdeterminants. Then we give a novel proof of one of the main results of [20] that
the principal minors of any n x n symmetric matrix satisfy the 2 x 2 x . . . x 2 (n times) hyperdeterminant. Our 
proof hinges on identifying a determinant formula for the multilinear form from which the hyperdeterminant arises. 

A. A Formula for the 2x2x2 Hyperdeterminant 

Obtaining an explicit formula for the hyperdeterminant is not an easy task. The first nontrivial hyperdeterminant, which is the $2\times2\times2$, was obtained by Cayley in 1845 [32]. Surprisingly, however, calculating the next hyperdeterminant, which is the $2\times2\times2\times2$, proves to be very difficult. Until recently the only method for computing the $2\times2\times2\times2$ hyperdeterminant was the nested formula of Schlafli, which he had obtained in 1852 [34], [33], and although after 150 years Luque and Thibon [35] expressed it in terms of the fundamental tensor invariants, the monomial expansion of this hyperdeterminant remained a challenge. It was finally solved recently in [36], where it is shown that the $2\times2\times2\times2$ hyperdeterminant consists of 2,894,276 terms. It is interesting to mention that Cayley had a 340 term expression for the $2\times2\times2\times2$ hyperdeterminant which satisfies many invariance properties of the hyperdeterminant and only fails to satisfy a few extra conditions [37]. Therefore, as mentioned previously, computing hyperdeterminants of different formats is generally nontrivial. In fact even Schlafli's method only works for some special hyperdeterminant formats. Moreover, according to [33] it is not easy to prove directly that (105) vanishes if and only if (102) has a non-trivial solution. Here we propose a new formula for (and a method to obtain) the $2\times2\times2$ hyperdeterminant which shows this if and only if connection directly. Moreover, this method might be extendable to hyperdeterminants of larger format.

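Theorem 7: (Determinant formula for $2\times2\times2$ hyperdeterminant) Define

$$B_0 = \begin{pmatrix} a_{000} & a_{100} \\ a_{001} & a_{101} \end{pmatrix}, \quad B_1 = \begin{pmatrix} a_{010} & a_{110} \\ a_{011} & a_{111} \end{pmatrix}, \quad J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

Then the $2\times2\times2$ hyperdeterminant is given by

$$\det(B_0JB_1^t - B_1JB_0^t) = 0 \tag{106}$$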



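Proof: Let $f$ be a multilinear form of the format $2\times2\times2$,

$$f(X,Y,Z) = \sum_{i,j,k=0}^{1} a_{ijk}\,x_iy_jz_k \tag{107}$$

Then by the change of variables $w_0 = x_0y_0$, $w_1 = x_1y_0$, $w_2 = x_0y_1$, $w_3 = x_1y_1$, the function $f$ can be written as,

$$f(X,Y,Z) = \begin{pmatrix} z_0 & z_1 \end{pmatrix}\begin{pmatrix} a_{000} & a_{100} & a_{010} & a_{110} \\ a_{001} & a_{101} & a_{011} & a_{111} \end{pmatrix}\begin{pmatrix} w_0 \\ w_1 \\ w_2 \\ w_3 \end{pmatrix} \triangleq Z^t\begin{pmatrix} B_0 & B_1 \end{pmatrix}W \tag{108}$$

To proceed, recall from (102) that the hyperdeterminant of the multilinear form of the format $2\times2\times2$ vanishes if and only if there is a non-trivial solution $(X,Y,Z)$ to the system of partial derivative equations:

$$\frac{\partial f}{\partial x_i} = 0, \quad \frac{\partial f}{\partial y_j} = 0, \quad \frac{\partial f}{\partial z_k} = 0, \qquad i,j,k = 0,1 \tag{109}$$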



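(a) First we show that if there is a non-trivial solution to the equations (109), then (106) vanishes. By the chain rule, $\frac{\partial f}{\partial x_i} = \sum_k \frac{\partial w_k}{\partial x_i}\frac{\partial f}{\partial w_k}$ (and similarly for $y_j$), so we can write $\frac{\partial f}{\partial (X,Y)} = \left(\frac{\partial W}{\partial (X,Y)}\right)^t\frac{\partial f}{\partial W}$. Also from (108), $\frac{\partial f}{\partial Z} = \begin{pmatrix} B_0 & B_1 \end{pmatrix}W$. Therefore the degeneracy conditions equivalent with (109) become:

$$\left(\frac{\partial W}{\partial (X,Y)}\right)^t\frac{\partial f}{\partial W} = 0 \tag{110}$$
$$\begin{pmatrix} B_0 & B_1 \end{pmatrix}W = 0 \tag{111}$$

Condition (110) implies that the vector $\frac{\partial f}{\partial W}$ should belong to the null space of the matrix $\left(\frac{\partial W}{\partial (X,Y)}\right)^t$. The following lemma gives the structure of this null space.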



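Lemma 5: The null space of the matrix $\left(\frac{\partial W}{\partial (X,Y)}\right)^t$ is characterized by vectors of the form $\begin{pmatrix} w_3 & -w_2 & -w_1 & w_0 \end{pmatrix}^t$.

Proof: Let $V$ be a $4\times1$ vector. Noting that for $j \in \{1,2\}$ the $j$th column of $\frac{\partial W}{\partial (X,Y)}$ is $\frac{\partial W}{\partial x_{j-1}}$, and for $j \in \{3,4\}$ it is $\frac{\partial W}{\partial y_{j-3}}$, we have:

$$\left(\frac{\partial W}{\partial (X,Y)}\right)^t V = \begin{pmatrix} y_0 & 0 & y_1 & 0 \\ 0 & y_0 & 0 & y_1 \\ x_0 & x_1 & 0 & 0 \\ 0 & 0 & x_0 & x_1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} = 0 \tag{112}$$

Solving for $V$ in the above yields the equations:

$$\frac{v_1}{v_3} = \frac{v_2}{v_4} = -\frac{y_1}{y_0} \tag{113}$$
$$\frac{v_1}{v_2} = \frac{v_3}{v_4} = -\frac{x_1}{x_0} \tag{114}$$

Letting $v_4 = x_0y_0$ characterizes the vectors in the null space up to a scale:

$$V = \begin{pmatrix} x_1y_1 & -x_0y_1 & -x_1y_0 & x_0y_0 \end{pmatrix}^t = \begin{pmatrix} w_3 & -w_2 & -w_1 & w_0 \end{pmatrix}^t \tag{115}$$

We further note that, provided $\begin{pmatrix} x_0 \\ x_1 \end{pmatrix} \ne 0$ and $\begin{pmatrix} y_0 \\ y_1 \end{pmatrix} \ne 0$, the matrix $\left(\frac{\partial W}{\partial (X,Y)}\right)^t$ has rank 3 and $V$ is therefore the only null space vector (up to a scaling). ■

Going back to the proof of Theorem 7, using Lemma 5 we conclude that we should have $\begin{pmatrix} B_0 & B_1 \end{pmatrix}W = 0$ and, for an arbitrary non-zero scalar $\alpha$, $\frac{\partial f}{\partial W} = \begin{pmatrix} B_0 & B_1 \end{pmatrix}^tZ = \alpha\begin{pmatrix} w_3 & -w_2 & -w_1 & w_0 \end{pmatrix}^t$. Putting these two equations into matrix form we can further write the following:

$$\begin{pmatrix} 0 & \begin{pmatrix} B_0 & B_1 \end{pmatrix} \\ \begin{pmatrix} B_0 & B_1 \end{pmatrix}^t & 0 \end{pmatrix}\begin{pmatrix} Z \\ W \end{pmatrix} = \alpha\begin{pmatrix} 0 \\ 0 \\ w_3 \\ -w_2 \\ -w_1 \\ w_0 \end{pmatrix} \tag{116}$$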



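or in other form:

$$\begin{pmatrix} 0 & \alpha J & B_0^t \\ -\alpha J & 0 & B_1^t \\ B_0 & B_1 & 0 \end{pmatrix}\begin{pmatrix} W \\ Z \end{pmatrix} = 0 \tag{117}$$

A non-trivial solution for $X, Y, Z$, and hence for $W, Z$, requires the matrix to be low rank. Evaluating the determinant we have:

$$\det\begin{pmatrix} 0 & \alpha J & B_0^t \\ -\alpha J & 0 & B_1^t \\ B_0 & B_1 & 0 \end{pmatrix} = \det\begin{pmatrix} 0 & \alpha J \\ -\alpha J & 0 \end{pmatrix}\det\left(-\begin{pmatrix} B_0 & B_1 \end{pmatrix}\begin{pmatrix} 0 & \alpha J \\ -\alpha J & 0 \end{pmatrix}^{-1}\begin{pmatrix} B_0^t \\ B_1^t \end{pmatrix}\right) \tag{118}$$

$$= \alpha^2\det\left(\begin{pmatrix} B_0 & B_1 \end{pmatrix}\begin{pmatrix} 0 & -J^{-1} \\ J^{-1} & 0 \end{pmatrix}\begin{pmatrix} B_0^t \\ B_1^t \end{pmatrix}\right) \tag{119}$$

Using the fact that $J^{-1} = -J$ we can write the following,

$$\det\left(\begin{pmatrix} B_0 & B_1 \end{pmatrix}\begin{pmatrix} 0 & J \\ -J & 0 \end{pmatrix}\begin{pmatrix} B_0^t \\ B_1^t \end{pmatrix}\right) = \det(B_0JB_1^t - B_1JB_0^t) \tag{120}$$

Note that the explicit calculation of (120) gives,

$$\det\begin{pmatrix} 2(a_{100}a_{010} - a_{000}a_{110}) & a_{100}a_{011} + a_{101}a_{010} - a_{000}a_{111} - a_{001}a_{110} \\ a_{100}a_{011} + a_{101}a_{010} - a_{000}a_{111} - a_{001}a_{110} & 2(a_{101}a_{011} - a_{001}a_{111}) \end{pmatrix} = 0 \tag{121}$$

which when expanded gives the $2\times2\times2$ hyperdeterminant formula stated in equation (105), as expected.

(b) Conversely, suppose that (120) vanishes and therefore there is a non-trivial solution for $W$ and $Z$ in (117). To prove that there is also a non-trivial solution to (109), we need to show that such $X$, $Y$ and $Z$ exist so that (110) and (111) hold. By definition of $w_0, w_1, w_2, w_3$, it is not hard to see that valid $x_0, x_1, y_0$ and $y_1$ can be found from the $w_i$ only if $W = \begin{pmatrix} w_0 & w_1 & w_2 & w_3 \end{pmatrix}^t$ in (117) has the property,

$$\frac{w_0}{w_1} = \frac{w_2}{w_3} \tag{122}$$

In the following we show that the solution of (117) in fact satisfies relation (122). Let $p = \begin{pmatrix} w_0 & w_1 \end{pmatrix}^t$ and $q = \begin{pmatrix} w_2 & w_3 \end{pmatrix}^t$. Then from (117) we obtain:

$$\alpha Jq + B_0^tZ = 0 \tag{123}$$
$$-\alpha Jp + B_1^tZ = 0 \tag{124}$$
$$B_0p + B_1q = 0 \tag{125}$$

Multiplying the first equation by $p^t$ and the second one by $q^t$ and adding them together we obtain,

$$\alpha(p^tJq - q^tJp) + (p^tB_0^t + q^tB_1^t)Z = 0 \tag{126}$$

which by the use of (125) simplifies to:

$$p^tJq = q^tJp \tag{127}$$

Noting that $p^tJq = (p^tJq)^t = -q^tJp$ gives,

$$p^tJq = q^tJp = 0 \tag{128}$$

(122) then follows immediately from (128) by substituting for $p$ and $q$. ■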
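The identity between the determinant formula (106) and the expanded expression (105) can be spot-checked with exact integer arithmetic. The sketch below is an illustration under assumptions: the tensor is stored as a nested list `a[i][j][k]`, the function names are hypothetical, and $B_0$, $B_1$, $J$ are taken as in Theorem 7.

```python
import random

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(m):
    return [list(r) for r in zip(*m)]

def hyperdet_via_det(a):
    """det(B0 J B1^t - B1 J B0^t) for the tensor a[i][j][k] = a_ijk."""
    B0 = [[a[0][0][0], a[1][0][0]], [a[0][0][1], a[1][0][1]]]
    B1 = [[a[0][1][0], a[1][1][0]], [a[0][1][1], a[1][1][1]]]
    J = [[0, -1], [1, 0]]
    left = matmul(matmul(B0, J), transpose(B1))
    right = matmul(matmul(B1, J), transpose(B0))
    M = [[left[r][c] - right[r][c] for c in range(2)] for r in range(2)]
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def hyperdet_cayley(a):
    """Left-hand side of (105), written out term by term."""
    a000, a001, a010, a011 = a[0][0][0], a[0][0][1], a[0][1][0], a[0][1][1]
    a100, a101, a110, a111 = a[1][0][0], a[1][0][1], a[1][1][0], a[1][1][1]
    return (-a000**2 * a111**2 - a100**2 * a011**2 - a010**2 * a101**2 - a001**2 * a110**2
            - 4 * a000 * a110 * a101 * a011 - 4 * a100 * a010 * a001 * a111
            + 2 * a000 * a100 * a011 * a111 + 2 * a000 * a010 * a101 * a111
            + 2 * a000 * a001 * a110 * a111 + 2 * a100 * a010 * a101 * a011
            + 2 * a100 * a001 * a110 * a011 + 2 * a010 * a001 * a110 * a101)

def random_tensor(rng):
    return [[[rng.randint(-5, 5) for _ in range(2)] for _ in range(2)] for _ in range(2)]

def rank_one_tensor(x, y, z):
    """a_ijk = x_i y_j z_k, a degenerate form whose hyperdeterminant vanishes."""
    return [[[x[i] * y[j] * z[k] for k in range(2)] for j in range(2)] for i in range(2)]

rng = random.Random(0)
agree = all(hyperdet_via_det(t) == hyperdet_cayley(t)
            for t in (random_tensor(rng) for _ in range(200)))
print(agree, hyperdet_via_det(rank_one_tensor((1, 2), (3, -1), (2, 5))))
```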

B. Minors of a Symmetric Matrix Satisfy the Hyperdeterminant 

It has recently been shown in [20] that the principal minors of a symmetric matrix satisfy the hyperdeterminant 
relations. There, this was found by explicitly computing the determinant of a 3 x 3 matrix in terms of the other 
minors and noticing that it satisfied the 2x2x2 hyperdeterminant. In this section we give an explanation of 
why this relation holds for the principal minors of a symmetric matrix. The key ingredient is identifying a simple
determinant formula for the multilinear form (101) when the coefficients $a_{i_1,i_2,\ldots,i_n}$ are the principal minors of an
$n\times n$ symmetric matrix.

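Lemma 6: Let the elements of the tensor $[m_{i_1,i_2,\ldots,i_n}]$, $i_k \in \{0,1\}$, be the principal minors of an $n\times n$ matrix $A$ such that $m_{i_1,i_2,\ldots,i_n}$, $i_k \in \{0,1\}$, denotes the principal minor obtained by choosing the rows and columns of $A$ indexed by the set $\alpha = \{k \mid i_k = 1\}$ (by convention, when all indices are zero, $m_{00\ldots0} = 1$). Then the following multilinear form of the format $2\times2\times\ldots\times2$ ($n$ times),

$$f(X_1,X_2,\ldots,X_n) = \sum_{i_1,i_2,\ldots,i_n=0}^{1} m_{i_1,i_2,\ldots,i_n}\, x_{1,i_1}x_{2,i_2}\cdots x_{n,i_n} \tag{129}$$

can be rewritten as the determinant of the matrix $F$, i.e., $f(X_1,X_2,\ldots,X_n) = \det(F)$ where $F$ is the following matrix:

$$F = \begin{pmatrix} x_{1,0} & & & 0 \\ & x_{2,0} & & \\ & & \ddots & \\ 0 & & & x_{n,0} \end{pmatrix} + \begin{pmatrix} x_{1,1} & & & 0 \\ & x_{2,1} & & \\ & & \ddots & \\ 0 & & & x_{n,1} \end{pmatrix} A \tag{130}$$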



Proof: First note that the determinant of $F$ has the form $\det(F) = \sum_{i_1,i_2,\ldots,i_n=0}^{1} b_{i_1,i_2,\ldots,i_n}\, x_{1,i_1}x_{2,i_2}\cdots x_{n,i_n}$ for some $A$-dependent coefficients $b_{i_1,i_2,\ldots,i_n}$ (this is because each $x_{j,\cdot}$ appears only in the $j$th row of $F$). To prove that $\det(F)$ is in fact equal to (129), we need to show that $b_{i_1,i_2,\ldots,i_n} = m_{i_1,i_2,\ldots,i_n}$, or in other words that the $b_{i_1,i_2,\ldots,i_n}$ are the corresponding minors of $A$.

Let $(p_1,\ldots,p_n)$ be a realization of $\{0,1\}^n$. For $j = 1,\ldots,n$, let the variables $x_{j,p_j} = 1$ and the rest of the variables be zero. This choice of values makes $\det(F) = b_{p_1,p_2,\ldots,p_n}$ and $f(X_1,X_2,\ldots,X_n) = m_{p_1,p_2,\ldots,p_n}$. Moreover, it can be easily seen that in this case $\det(F)$ in (130) will simply be equal to the minor of the matrix $A$ obtained by choosing the set of rows and columns $\alpha \subseteq \{1,\ldots,n\}$ such that $p_j = 1$ for all $j \in \alpha$. By assumption this is nothing but the coefficient $m_{p_1,p_2,\ldots,p_n}$ in (129) and therefore the lemma is proved. ■
Remark: Note that Lemma 6 does not require the matrix $A$ to be symmetric.
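Lemma 6 lends itself to a direct numerical check. The sketch below (helper names hypothetical, integer test data chosen at random) evaluates the multilinear form (129) by summing over all index patterns and compares it with $\det(F)$ for $F$ built as in (130); in line with the remark, the test matrix is not required to be symmetric.

```python
import random
from itertools import product

def det(m):
    """Exact determinant by Laplace expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def principal_minor(A, idx):
    return det([[A[r][c] for c in idx] for r in idx])

def multilinear_form(A, X):
    """Evaluate (129): sum over bit patterns of minor times product of x_{j, i_j}."""
    n = len(A)
    total = 0
    for bits in product((0, 1), repeat=n):
        term = principal_minor(A, [k for k in range(n) if bits[k] == 1])
        for j in range(n):
            term *= X[j][bits[j]]
        total += term
    return total

def build_F(A, X):
    """F = diag(x_{j,0}) + diag(x_{j,1}) A, as in (130)."""
    n = len(A)
    return [[X[j][1] * A[j][k] + (X[j][0] if j == k else 0) for k in range(n)]
            for j in range(n)]

rng = random.Random(0)
A = [[rng.randint(-4, 4) for _ in range(4)] for _ in range(4)]   # not symmetric in general
X = [(rng.randint(-3, 3), rng.randint(-3, 3)) for _ in range(4)]
print(multilinear_form(A, X) == det(build_F(A, X)))
```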

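Lemma 7 (Partial derivatives of $\det F$): Let $\alpha_j = \{1,\ldots,n\}\setminus j$ and $\alpha_k = \{1,\ldots,n\}\setminus k$. Computing the partial derivatives of $\det F$ gives:

$$\frac{\partial \det F}{\partial x_{j,0}} = \det F_{\alpha_j,\alpha_j} \tag{131}$$
$$\frac{\partial \det F}{\partial x_{j,1}} = \sum_{k=1}^{n} a_{jk}(-1)^{j+k}\det F_{\alpha_j,\alpha_k} \tag{132}$$

where $a_{jk}$ denotes the $(j,k)$ entry of $A$ and $F_{\alpha_j,\alpha_k}$ denotes the submatrix of $F$ obtained by choosing the rows in $\alpha_j$ and columns in $\alpha_k$.

Proof: Consider the expansion of $\det F$ along its $j$th row as $\det F = \sum_{k=1}^{n} F_{jk}(-1)^{j+k}\det F_{\alpha_j,\alpha_k}$. Noting that for $k \ne j$, $F_{jk} = x_{j,1}a_{jk}$ and for $j = k$, $F_{jj} = x_{j,0} + x_{j,1}a_{jj}$, we obtain that

$$\det F = x_{j,1}\sum_{k=1}^{n} a_{jk}(-1)^{j+k}\det F_{\alpha_j,\alpha_k} + x_{j,0}\det F_{\alpha_j,\alpha_j} \tag{133}$$

Taking partial derivatives immediately gives (131) and (132). ■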

Now we can write the condition for the minors of A to satisfy the hyperdeterminant: 

Lemma 8 (rank of $F$): The principal minors of matrix $A$ satisfy the hyperdeterminant equation if there exists a set of solutions $x_{j,0}$ and $x_{j,1}$ for which the rank of $F$ in (130) is at most $n - 2$.

Proof: If there exists a nontrivial set of solutions $x_{j,0}$ and $x_{j,1}$ such that $F$ has rank $n - 2$, then both (131) and (132) vanish (because all the $(n-1)\times(n-1)$ minors also vanish). But the vanishing of (131) and (132) simply means that the minors of $A$ satisfy the $2\times2\times\cdots\times2$ ($n$ times) hyperdeterminant. ■

Theorem 8 (hyperdeterminant and the principal minors): The principal minors of an $n\times n$ symmetric matrix $A$ satisfy the hyperdeterminants of the format $2\times2\times\ldots\times2$ ($k$ times) for all $k \le n$.

Proof: It is sufficient to show that the minors satisfy the $2\times2\times\ldots\times2$ ($n$ times) hyperdeterminant. Recall that for the tensor of coefficients $[a_{i_1,\ldots,i_n}]$ in the multilinear form (101) to satisfy the hyperdeterminant relation, there must exist a non-trivial solution that makes all the partial derivatives of $f$ with respect to its variables zero. Lemma 8 suggests that a set of nontrivial $x_{j,0}$ and $x_{j,1}$ for which the rank of $F$ is at most $n - 2$ would be sufficient. In the following we will show that one can always find a solution to make $\mathrm{rank}(F) \le n - 2$.

First we find a non-trivial solution in the case of 3 variables and then extend it to the case where there are $n$ variables. For 3 variables, the matrix $F$, which is of the following form,

$$F = \begin{pmatrix} x_{1,0} + x_{1,1}a_{11} & x_{1,1}a_{12} & x_{1,1}a_{13} \\ x_{2,1}a_{12} & x_{2,0} + x_{2,1}a_{22} & x_{2,1}a_{23} \\ x_{3,1}a_{13} & x_{3,1}a_{23} & x_{3,0} + x_{3,1}a_{33} \end{pmatrix} \tag{134}$$

should be rank 1, or equivalently all the rows should be multiples of one another. Enforcing this condition results in 3 equations for 6 unknowns. Therefore, without loss of generality, we let $x_{j,1} = 1$. Making the rows of $F$ proportional gives:

$$\frac{x_{1,0} + a_{11}}{a_{12}} = \frac{a_{12}}{x_{2,0} + a_{22}} = \frac{a_{13}}{a_{23}} \tag{135}$$
$$\frac{a_{13}}{a_{12}} = \frac{a_{23}}{x_{2,0} + a_{22}} = \frac{x_{3,0} + a_{33}}{a_{23}} \tag{136}$$

If $X_j = (x_{j,0}, x_{j,1})$, then the solution to the above equations is clearly as follows:

$$X_1 = \left(\frac{a_{12}a_{13} - a_{11}a_{23}}{a_{23}},\ 1\right), \quad X_2 = \left(\frac{a_{23}a_{12} - a_{13}a_{22}}{a_{13}},\ 1\right), \quad X_3 = \left(\frac{a_{13}a_{23} - a_{12}a_{33}}{a_{12}},\ 1\right) \tag{137}$$

Now for the general case of $n$ variables, let $X_1, X_2, X_3$ be as in (137) and for $j > 3$, $X_j = (1,0)$. It can be easily checked that this solution makes the matrix $F$ of rank $n - 2$ and therefore the principal minors satisfy the $2\times2\times\ldots\times2$ ($n$ times) hyperdeterminant. ■
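The rank claim at the end of the proof can be checked with exact rational arithmetic. The following sketch assumes a symmetric integer matrix with non-zero $a_{12}, a_{13}, a_{23}$ (so that (137) is well defined); the matrix entries and helper names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction

def rank(mat):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in mat]
    rows, cols, r = len(m), len(m[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
    return r

def build_F(A, X):
    """F = diag(x_{j,0}) + diag(x_{j,1}) A, as in (130)."""
    n = len(A)
    return [[X[j][1] * A[j][k] + (X[j][0] if j == k else 0) for k in range(n)]
            for j in range(n)]

def solution_137(A):
    """X_1, X_2, X_3 from (137), padded with X_j = (1, 0) for j > 3."""
    x1 = (Fraction(A[0][1] * A[0][2] - A[0][0] * A[1][2], A[1][2]), 1)
    x2 = (Fraction(A[1][2] * A[0][1] - A[0][2] * A[1][1], A[0][2]), 1)
    x3 = (Fraction(A[0][2] * A[1][2] - A[0][1] * A[2][2], A[0][1]), 1)
    return [x1, x2, x3] + [(1, 0)] * (len(A) - 3)

A4 = [[2, 1, 3, 5], [1, 4, 2, 7], [3, 2, 6, 1], [5, 7, 1, 9]]   # symmetric example
A3 = [row[:3] for row in A4[:3]]
print(rank(build_F(A3, solution_137(A3))), rank(build_F(A4, solution_137(A4))))
```

For the $3\times3$ block the rank is 1, and padding with $X_4 = (1,0)$ gives rank $2 = n - 2$ for $n = 4$.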

Notation 1: Each element $m_{i_1,i_2,\ldots,i_n}$, where $i_k \in \{0,1\}$, can alternatively be represented as $m_\alpha$, $\alpha \subseteq \{1,\ldots,n\}$, where $\alpha = \{k \mid i_k = 1\}$. For example, $m_{100} = m_1$ and $m_{011} = m_{23}$.

Since, based on Theorem 8, the principal minors of a symmetric matrix denoted by $m_{i_1,\ldots,i_n}$ satisfy the hyperdeterminant, we may write the $2\times2\times2$ hyperdeterminant relation of (105) for the principal minors of a $3\times3$ matrix. Adopting Notation 1 we obtain:

$$\begin{aligned}
& m_\emptyset^2m_{123}^2 + m_1^2m_{23}^2 + m_2^2m_{13}^2 + m_3^2m_{12}^2 + 4m_\emptyset m_{12}m_{13}m_{23} + 4m_1m_2m_3m_{123} - 2m_\emptyset m_1m_{23}m_{123} \\
& - 2m_\emptyset m_2m_{13}m_{123} - 2m_\emptyset m_3m_{12}m_{123} - 2m_1m_2m_{13}m_{23} - 2m_1m_3m_{12}m_{23} - 2m_2m_3m_{12}m_{13} = 0
\end{aligned} \tag{138}$$

Note that the solutions (137) also appear in [20] in an alternative proof of principal minors satisfying the hyperdeterminant relation.

Letting $m_\emptyset = 1$, the $2\times2\times2$ hyperdeterminant can also be written as,

$$(m_{123} - m_3m_{12} - m_2m_{13} - m_1m_{23} + 2m_1m_2m_3)^2 = 4(m_1m_2 - m_{12})(m_1m_3 - m_{13})(m_2m_3 - m_{23}) \tag{139}$$
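Relation (139) is easy to test on a concrete matrix. The sketch below uses illustrative integer matrices and hypothetical helper names; it computes the principal minors of a $3\times3$ matrix and returns the difference between the two sides of (139), which vanishes for the symmetric example and, for comparison, does not vanish for the non-symmetric one.

```python
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def minors3(M):
    """Principal minors of a 3x3 matrix, keyed by 1-based index set."""
    m = {(k + 1,): M[k][k] for k in range(3)}
    for i, j in ((0, 1), (0, 2), (1, 2)):
        m[(i + 1, j + 1)] = M[i][i] * M[j][j] - M[i][j] * M[j][i]
    m[(1, 2, 3)] = det3(M)
    return m

def residual_139(M):
    """Left-hand side minus right-hand side of (139)."""
    m = minors3(M)
    m1, m2, m3 = m[(1,)], m[(2,)], m[(3,)]
    m12, m13, m23, m123 = m[(1, 2)], m[(1, 3)], m[(2, 3)], m[(1, 2, 3)]
    lhs = (m123 - m3 * m12 - m2 * m13 - m1 * m23 + 2 * m1 * m2 * m3) ** 2
    rhs = 4 * (m1 * m2 - m12) * (m1 * m3 - m13) * (m2 * m3 - m23)
    return lhs - rhs

S = [[2, 1, 3], [1, 4, 2], [3, 2, 6]]      # symmetric
N = [[2, 1, 3], [5, 4, 2], [0, 7, 6]]      # not symmetric
print(residual_139(S), residual_139(N))
```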

V. Minimal Number of Conditions for the Elements of a $(2^n - 1)$-Dimensional Vector to Be the Principal Minors of a Symmetric $n\times n$ Matrix for $n \ge 4$

In order to determine whether a $2^n - 1$ dimensional vector $g$ corresponds to the entropy vector of $n$ scalar jointly Gaussian random variables, one needs to check whether the vector $e^{g}$ corresponds to all the principal minors of a symmetric positive semi-definite matrix. Define $A = e^{g}$ and let the elements of the vector $A \in \mathbb{R}^{2^n-1}$ be denoted by $A_\alpha$, $\alpha \subseteq \{1,\ldots,n\}$, $\alpha \ne \emptyset$. An interesting problem is to find the minimal set of conditions under which the vector $A$ can be considered as the vector of all principal minors of a symmetric $n\times n$ matrix. This problem is known as the "principal minor assignment" problem and has been addressed before in [20], [28]. In fact, in a recent remarkable work, [20] gives the set of necessary and sufficient conditions for this problem. Nonetheless it does not point out the minimal set of such necessary and sufficient equations. Instead [20] is mainly interested in the generators of the prime ideal of all homogeneous polynomial relations among the principal minors of an $n\times n$ symmetric matrix. Here we propose the minimal set of such conditions for $n \ge 4$.

Roughly speaking, there are $2^n - 1$ variables in the vector $A$ and only $\frac{n(n+1)}{2}$ parameters in a symmetric $n\times n$ matrix. Therefore, if the elements of $A$ can be considered as the minors of an $n\times n$ symmetric matrix, one suspects that there should be $2^n - 1 - \frac{n(n+1)}{2}$ constraints on the elements of $A$. These constraints, which can be translated to relations between the elements of the entropy vector arising from $n$ scalar Gaussian random variables, can be used as the starting point for determining the entropy region of $n \ge 4$ jointly Gaussian scalar random variables.

We start this section by studying the entropy region of 4 jointly Gaussian random variables using the results on the hyperdeterminant already mentioned in the previous section, and we shall explicitly state the sufficiency of 5 constraints among all the constraints given in [20] by using a proof similar to that of [20]: for a given vector $A$, and under such constraints, one can construct the symmetric matrix $\tilde{A} = [\tilde{a}_{ij}]$ with the desired principal minors. Later in this section we state such a minimal number of conditions for a $2^n - 1$ dimensional vector for $n \ge 4$. Now define

$$c_{ijk} = A_{ijk} - A_iA_{jk} - A_jA_{ik} - A_kA_{ij} + 2A_iA_jA_k \tag{140}$$

Theorem 9: Let $A$ be a 15 dimensional vector whose elements are indexed by non-empty subsets of $\{1,2,3,4\}$ and which has the property that

$$A_{ij} \le A_iA_j, \quad \forall\, \{i,j\} \subseteq \{1,2,3,4\}. \tag{141}$$

Then the minimal set of necessary and sufficient conditions for the elements of the vector $A$ to be the principal minors of a symmetric $4\times4$ matrix consists of three hyperdeterminant equations (142)-(144), one consistency condition on the signs of the $c_{ijk}$ (145), and the determinant identity of the $4\times4$ matrix (146):

$$c_{123}^2 = 4(A_1A_2 - A_{12})(A_2A_3 - A_{23})(A_1A_3 - A_{13}) \tag{142}$$
$$c_{124}^2 = 4(A_1A_2 - A_{12})(A_2A_4 - A_{24})(A_1A_4 - A_{14}) \tag{143}$$
$$c_{134}^2 = 4(A_1A_3 - A_{13})(A_3A_4 - A_{34})(A_1A_4 - A_{14}) \tag{144}$$
$$c_{123}c_{124}c_{134} = 4(A_1A_2 - A_{12})(A_1A_3 - A_{13})(A_1A_4 - A_{14})\,c_{234} \tag{145}$$
$$\begin{aligned} A_{1234} = {} & -\frac{1}{2}\sum_{\substack{i'<j' \in \{1,2,3\} \\ \{k',l'\} = \{1,2,3,4\}\setminus\{i',j'\}}} \frac{c_{i'j'k'}\,c_{i'j'l'}}{A_{i'}A_{j'} - A_{i'j'}} + A_1c_{234} + A_2c_{134} \\ & + A_3c_{124} + A_4c_{123} - 2A_1A_2A_3A_4 + A_{12}A_{34} + A_{13}A_{24} + A_{14}A_{23} \end{aligned} \tag{146}$$

Proof: a) It is easy to show the necessity of equations (142)-(146), and this was done in [20]. Here we illustrate
the method to make the paper self-contained. Note that if the elements of the vector $A$ are the principal minors of a
symmetric matrix then by Theorem 8 they satisfy the hyperdeterminant relations, which from (139) can be written
as,

$$(A_{123} - A_3 A_{12} - A_2 A_{13} - A_1 A_{23} + 2 A_1 A_2 A_3)^2 = 4 (A_1 A_2 - A_{12})(A_1 A_3 - A_{13})(A_2 A_3 - A_{23}) \tag{147}$$

Using the definition of $c_{ijk}$ in (140), equation (147) can be further written as:

$$c_{123}^2 = 4 (A_1 A_2 - A_{12})(A_1 A_3 - A_{13})(A_2 A_3 - A_{23}) \tag{148}$$

Therefore equations (142)-(144) simply represent the hyperdeterminant relations and hold by Theorem 8. Moreover,
if $A_{ijk}$ is the principal minor of the matrix $\tilde{A}$ obtained by choosing rows and columns $i$, $j$ and $k$, then
writing $A_{ijk}$ in terms of the entries of $\tilde{A}$ gives,

$$A_{ijk} = \tilde{a}_{ii}\tilde{a}_{jj}\tilde{a}_{kk} - \tilde{a}_{ii}\tilde{a}_{jk}^2 - \tilde{a}_{jj}\tilde{a}_{ik}^2 - \tilde{a}_{kk}\tilde{a}_{ij}^2 + 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik}$$

$$= -2 A_i A_j A_k + A_i A_{jk} + A_j A_{ik} + A_k A_{ij} + 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} \tag{149}$$






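where, since $A$ corresponds to the principal minors of $\tilde{A}$, we have substituted $\tilde{a}_{ii} = A_i$ and
$\tilde{a}_{ij}^2 = A_i A_j - A_{ij}$. Rewriting (149) we obtain,

$$A_{ijk} - A_i A_{jk} - A_j A_{ik} - A_k A_{ij} + 2 A_i A_j A_k = 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} \tag{150}$$

which by comparison to (140) means that the following holds,

$$c_{ijk} = 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} \tag{151}$$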

Therefore, substituting for $c_{123}$, $c_{124}$, $c_{134}$ and $c_{234}$ from (151) in (145) and simplifying, we obtain that (145)
holds trivially. The last condition (146) is nothing but the expansion of the $4 \times 4$ determinant of $\tilde{A}$ in terms of the
entries of $\tilde{A}$, with those entries then replaced in terms of the $c_{ijk}$ and lower order minors, and is therefore a necessary condition.
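The necessity argument can be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes the form of (146) as written in this section, uses exact rational arithmetic, and the matrix `M` and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for n <= 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def principal_minors(mat):
    """Map each non-empty sorted index tuple (1-based) to its principal minor."""
    n = len(mat)
    return {s: det([[mat[p - 1][q - 1] for q in s] for p in s])
            for size in range(1, n + 1)
            for s in combinations(range(1, n + 1), size)}

def c(A, i, j, k):
    """c_ijk from (140); the indices may be given in any order."""
    i, j, k = sorted((i, j, k))
    return (A[(i, j, k)] - A[(i,)] * A[(j, k)] - A[(j,)] * A[(i, k)]
            - A[(k,)] * A[(i, j)] + 2 * A[(i,)] * A[(j,)] * A[(k,)])

def rhs_146(A):
    """Right-hand side of (146), in the form assumed in this section."""
    total = Fraction(0)
    for i, j in combinations((1, 2, 3), 2):
        k, l = [t for t in (1, 2, 3, 4) if t not in (i, j)]
        total -= (Fraction(1, 2) * c(A, i, j, k) * c(A, i, j, l)
                  / (A[(i,)] * A[(j,)] - A[(i, j)]))
    total += (A[(1,)] * c(A, 2, 3, 4) + A[(2,)] * c(A, 1, 3, 4)
              + A[(3,)] * c(A, 1, 2, 4) + A[(4,)] * c(A, 1, 2, 3))
    total += (-2 * A[(1,)] * A[(2,)] * A[(3,)] * A[(4,)]
              + A[(1, 2)] * A[(3, 4)] + A[(1, 3)] * A[(2, 4)] + A[(1, 4)] * A[(2, 3)])
    return total

# Hypothetical symmetric matrix; every off-diagonal entry is nonzero, so (141) holds.
M = [[Fraction(v) for v in row] for row in
     [[5, 1, -2, 1], [1, 6, 1, -2], [-2, 1, 7, 1], [1, -2, 1, 8]]]
A = principal_minors(M)

# (151): c_ijk equals twice the product of the three off-diagonal entries.
triple_ok = all(c(A, i, j, k) == 2 * M[i - 1][j - 1] * M[j - 1][k - 1] * M[i - 1][k - 1]
                for i, j, k in combinations((1, 2, 3, 4), 3))
# (145): sign-consistency relation.
sign_ok = (c(A, 1, 2, 3) * c(A, 1, 2, 4) * c(A, 1, 3, 4)
           == 4 * (A[(1,)] * A[(2,)] - A[(1, 2)]) * (A[(1,)] * A[(3,)] - A[(1, 3)])
           * (A[(1,)] * A[(4,)] - A[(1, 4)]) * c(A, 2, 3, 4))
# (146): the 4x4 minor expressed through lower order minors.
det_ok = rhs_146(A) == A[(1, 2, 3, 4)]
print(triple_ok, sign_ok, det_ok)
```

Exact fractions are used here so that the identities can be compared with `==` rather than with a floating-point tolerance.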

b) Now we need to show the sufficiency of equations (142)-(146). To do so, we assume that the given vector
$A$ satisfies (142)-(146) and we want to show that it is the principal minor vector of some symmetric matrix $\tilde{A}$.
Hence we try to construct a matrix $\tilde{A}$ whose principal minors are given by the entries of the vector $A$. First note that
such a matrix $\tilde{A}$ should have $\tilde{a}_{ii} = A_i$ and $\tilde{a}_{ij}^2 = A_i A_j - A_{ij}$ (or equivalently
$\tilde{a}_{ij} = \pm\sqrt{A_i A_j - A_{ij}}$). The only ambiguity in fully determining the entries of $\tilde{A}$ will therefore be
the signs of the off-diagonal entries. To have the $3 \times 3$ minors of $\tilde{A}$ also equal to the corresponding entries of
the vector $A$, we should have:

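$$A_{ijk} = \det \begin{bmatrix} \tilde{a}_{ii} & \tilde{a}_{ij} & \tilde{a}_{ik} \\ \tilde{a}_{ij} & \tilde{a}_{jj} & \tilde{a}_{jk} \\ \tilde{a}_{ik} & \tilde{a}_{jk} & \tilde{a}_{kk} \end{bmatrix} \tag{152}$$

$$= \tilde{a}_{ii}\tilde{a}_{jj}\tilde{a}_{kk} - \tilde{a}_{ii}\tilde{a}_{jk}^2 - \tilde{a}_{jj}\tilde{a}_{ik}^2 - \tilde{a}_{kk}\tilde{a}_{ij}^2 + 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} \tag{153}$$

Substituting the values of $\tilde{a}_{ii}$ and $\tilde{a}_{ij}^2$ in terms of $A_i$ and $A_{ij}$ we obtain that,

$$A_{ijk} - A_i A_{jk} - A_j A_{ik} - A_k A_{ij} + 2 A_i A_j A_k = 2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} \tag{154}$$

Therefore, writing the condition for all $\{i, j, k\} \subset \{1, 2, 3, 4\}$, we need to have

$$c_{123} = 2\tilde{a}_{12}\tilde{a}_{23}\tilde{a}_{13} \tag{155}$$

$$c_{124} = 2\tilde{a}_{12}\tilde{a}_{14}\tilde{a}_{24} \tag{156}$$

$$c_{134} = 2\tilde{a}_{13}\tilde{a}_{14}\tilde{a}_{34} \tag{157}$$

$$c_{234} = 2\tilde{a}_{23}\tilde{a}_{24}\tilde{a}_{34} \tag{158}$$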

Note that the constraints (142)-(145) guarantee that $|c_{ijk}| = 2|\tilde{a}_{ij}\tilde{a}_{ik}\tilde{a}_{jk}|$. It remains to show that there is a
consistent choice of signs for the $\tilde{a}_{ij}$ such that the stronger condition $c_{ijk} = 2\tilde{a}_{ij}\tilde{a}_{ik}\tilde{a}_{jk}$ also holds. Note that without
loss of generality we can assume that the off-diagonal entries in the first row, i.e., the $\tilde{a}_{1j}$, are positive. This is due to
the fact that we can use the transformation $D\tilde{A}D^{-1}$, where $D$ is a diagonal matrix with $\pm 1$ elements, to make the $\tilde{a}_{1j}$
positive without affecting any of the principal minors. Hence, assuming the $\tilde{a}_{1j}$ are positive, the signs of $\tilde{a}_{23}$, $\tilde{a}_{24}$ and
$\tilde{a}_{34}$ are determined from the signs of $c_{123}$, $c_{124}$ and $c_{134}$ in (155)-(157). However, once the signs of $\tilde{a}_{23}$, $\tilde{a}_{24}$ and $\tilde{a}_{34}$
are determined, the sign of their product, i.e., $\tilde{a}_{23}\tilde{a}_{24}\tilde{a}_{34}$, should be the same as the sign of $c_{234}$ to satisfy (158).
This is enforced by equation (145). Therefore conditions (142)-(145) yield (155)-(158). Note that due to property
(141) none of the $\tilde{a}_{ij}$ are zero and hence all the above steps for the sign choice are valid. Finally, a direct calculation of
the $4 \times 4$ determinant of $\tilde{A}$ shows that it can be expressed as the right-hand side of equation (146) in terms of lower
order minors of $\tilde{A}$, which are equal to the corresponding terms $A_\alpha$, $|\alpha| \le 3$. Consequently equation (146) guarantees
that the $4 \times 4$ principal minor of $\tilde{A}$ is equal to $A_{1234}$. Again note that since property (141) holds, the denominator
in (146) is nonzero and will not cause any problems. Therefore $\tilde{A}$ will be the matrix with principal minors given
by the vector $A$. ■
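The constructive part of this proof can be sketched in code. The example below is an illustration under stated assumptions rather than the authors' implementation: the minor vector is generated from a hypothetical matrix `source` so that it is known to be realizable, and `reconstruct` is a hypothetical helper that applies the sign rule described above (first-row entries positive, every other off-diagonal entry taking the sign of $c_{1jk}$).

```python
import math
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def principal_minors(mat):
    """Map each non-empty sorted index tuple (1-based) to its principal minor."""
    n = len(mat)
    return {s: det([[mat[p - 1][q - 1] for q in s] for p in s])
            for size in range(1, n + 1)
            for s in combinations(range(1, n + 1), size)}

def c(A, i, j, k):
    """c_ijk from (140); the indices may be given in any order."""
    i, j, k = sorted((i, j, k))
    return (A[(i, j, k)] - A[(i,)] * A[(j, k)] - A[(j,)] * A[(i, k)]
            - A[(k,)] * A[(i, j)] + 2 * A[(i,)] * A[(j,)] * A[(k,)])

def reconstruct(A, n):
    """Build a symmetric matrix from a minor vector A (dict keyed by sorted tuples)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        a[i - 1][i - 1] = A[(i,)]
    for i, j in combinations(range(1, n + 1), 2):
        magnitude = math.sqrt(A[(i,)] * A[(j,)] - A[(i, j)])
        # First-row entries are taken positive; the rest inherit the sign of c_1ij.
        sign = 1.0 if i == 1 else math.copysign(1.0, c(A, 1, i, j))
        a[i - 1][j - 1] = a[j - 1][i - 1] = sign * magnitude
    return a

# Hypothetical realizable input: the principal minors of a known symmetric matrix.
source = [[5.0, 1.0, -2.0, 1.0],
          [1.0, 6.0, 1.0, -2.0],
          [-2.0, 1.0, 7.0, 1.0],
          [1.0, -2.0, 1.0, 8.0]]
A = principal_minors(source)
rebuilt = reconstruct(A, 4)
B = principal_minors(rebuilt)
max_error = max(abs(A[s] - B[s]) for s in A)
print(max_error)
```

The rebuilt matrix need not equal `source`: it may differ by a transformation $D\tilde{A}D^{-1}$ with a diagonal $\pm 1$ matrix $D$, which leaves all principal minors unchanged.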
Using a similar approach, which closely follows the proof methods of [20], we can write the set of minimal
necessary and sufficient conditions for a $2^n - 1$ dimensional vector to be the principal minors of a symmetric
matrix.

Theorem 10: Let $A$ be a $2^n - 1$ dimensional vector whose elements are indexed by the non-empty subsets of
$\{1, \ldots, n\}$ (assume $A_\emptyset = 1$) and suppose that it satisfies,

$$A_\alpha A_{i \cup j \cup \alpha} < A_{i \cup \alpha} A_{j \cup \alpha}, \quad \forall\, i, j \in \{1, \ldots, n\},\ \alpha \subset \{1, \ldots, n\} \setminus \{i, j\} \tag{159}$$

Then the necessary and sufficient conditions for a $2^n - 1$ dimensional vector to be the principal minors of a
symmetric $n \times n$ matrix consist of $2^n - 1 - \frac{n(n+1)}{2}$ equations and are as follows,

$$\forall\, j, k \subset \{2, \ldots, n\}: \quad c_{1jk}^2 = 4 (A_1 A_j - A_{1j})(A_1 A_k - A_{1k})(A_j A_k - A_{jk}) \tag{160}$$

$$\forall\, i, j, k \subset \{2, \ldots, n\}: \quad c_{1ij}\, c_{1ik}\, c_{1jk} = 4 (A_1 A_i - A_{1i})(A_1 A_j - A_{1j})(A_1 A_k - A_{1k})\, c_{ijk} \tag{161}$$

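Also, $\forall\, \beta \subset \{1, \ldots, n\}$ with $|\beta| \ge 4$, choose one set $\{i, j, k, l\} \subset \beta$ such that $i < j < k < l$ and let $\alpha = \beta \setminus \{i, j, k, l\}$,

$$D^{\alpha}_{ijkl} = 0 \tag{162}$$

where $D^{\alpha}_{ijkl}$ is obtained from the following by replacing every $A_S$, $S \subset \{i, j, k, l\}$, by $\frac{A_{S \cup \alpha}}{A_\alpha}$,

$$D_{ijkl} = A_{ijkl} + \frac{1}{2} \sum_{\substack{\{i',j'\} \subset \{i,j,k\} \\ \{k',l'\} = \{i,j,k,l\} \setminus \{i',j'\}}} \frac{c_{i'j'k'}\, c_{i'j'l'}}{A_{i'} A_{j'} - A_{i'j'}} - A_i c_{jkl} - A_j c_{ikl} - A_k c_{ijl} - A_l c_{ijk} + 2 A_i A_j A_k A_l - A_{ij} A_{kl} - A_{ik} A_{jl} - A_{il} A_{jk} = 0 \tag{163}$$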

Proof: The proof is essentially the same as the proof technique of [20] and is a generalization of Theorem 9 to
a $2^n - 1$ dimensional vector. However, we would like to highlight why (160)-(162) is the minimal set of necessary
and sufficient conditions among all the conditions given in [20]. First consider the necessity of the equations:

a) Showing the necessity of equations (160)-(162) is straightforward. In fact, if the elements of the vector $A$ are the
principal minors of a symmetric matrix $\tilde{A}$, then $\tilde{a}_{ii} = A_i$ and $\tilde{a}_{ij}^2 = A_i A_j - A_{ij}$. Furthermore it can be shown
(similar to the proof of Theorem 9) that $c_{ijk} = 2\tilde{a}_{ij}\tilde{a}_{ik}\tilde{a}_{jk}$, from which it follows that (160)-(161) hold. Note that
$D_{ijkl} = 0$ in equation (163) gives $A_{ijkl}$ in terms of lower order minors (compare to equation (146)). Now consider
a submatrix of $\tilde{A}$ whose rows and columns are indexed by $\beta \subset \{1, \ldots, n\}$, $|\beta| \ge 4$, and denote it by $\tilde{A}[\beta]$. For
$\{i, j, k, l\} \subset \beta$ let $\alpha = \beta \setminus \{i, j, k, l\}$ and likewise let the submatrix with rows and columns indexed by $\alpha$ be denoted
by $\tilde{A}[\alpha]$. Further denote the Schur complement of $\tilde{A}[\alpha]$ in $\tilde{A}[\beta]$ by $\tilde{A}[\alpha/\beta]$. Note that $\tilde{A}[\alpha/\beta]$ is a $4 \times 4$ matrix whose
determinant can be obtained via the rule of equation (163). The property of the Schur complement yields,

$$\det\left( \tilde{A}[\alpha/\beta] \right)_{[S]} = \frac{\det \tilde{A}[S \cup \alpha]}{\det \tilde{A}[\alpha]} = \frac{A_{S \cup \alpha}}{A_\alpha}, \quad \forall\, S \subset \{i, j, k, l\} \tag{164}$$

Therefore writing the determinant of the $4 \times 4$ matrix $\tilde{A}[\alpha/\beta]$ and using (164) gives equation (162) (see footnote 6).
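The Schur complement property (164) is easy to confirm on a small example. The sketch below is only an illustration: it takes $\alpha$ to be a single index for brevity (so that no matrix inverse is needed), uses a hypothetical $5 \times 5$ matrix `M`, and relies on exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def sub(mat, idx):
    """Principal submatrix with rows and columns taken from idx (0-based)."""
    return [[mat[p][q] for q in idx] for p in idx]

# Hypothetical symmetric 5x5 matrix; beta is all five indices.
M = [[Fraction(v) for v in row] for row in
     [[6, 1, 2, 1, 1],
      [1, 7, 1, 2, 1],
      [2, 1, 8, 1, 2],
      [1, 2, 1, 9, 1],
      [1, 1, 2, 1, 5]]]
alpha = 4             # the single (0-based) index forming alpha
gamma = [0, 1, 2, 3]  # the indices {i, j, k, l}

# Schur complement of M[alpha] in M: a rank-one correction when alpha is one index.
schur = [[M[p][q] - M[p][alpha] * M[alpha][q] / M[alpha][alpha] for q in gamma]
         for p in gamma]

# (164): each principal minor of the Schur complement equals A_{S u alpha} / A_alpha.
checks = []
for size in range(1, 5):
    for S in combinations(gamma, size):
        lhs = det(sub(schur, list(S)))
        rhs = det(sub(M, list(S) + [alpha])) / M[alpha][alpha]
        checks.append(lhs == rhs)
all_match = all(checks)
print(all_match, len(checks))
```

For a larger $\alpha$ the same check applies with the rank-one correction replaced by $\tilde{A}[\gamma, \alpha]\, \tilde{A}[\alpha]^{-1} \tilde{A}[\alpha, \gamma]$.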

b) To show the sufficiency we show that if a given vector $A$ satisfies equations (160)-(162) then we can construct
a symmetric matrix whose principal minors are given by $A$. As we did in Theorem 9, such a matrix $\tilde{A}$ should have
entries $\tilde{a}_{ii} = A_i$ and $\tilde{a}_{ij}^2 = A_i A_j - A_{ij}$ (or equivalently $\tilde{a}_{ij} = \pm\sqrt{A_i A_j - A_{ij}}$). Therefore it remains to choose
the signs of the off-diagonal entries in a consistent fashion so that all the minors of $\tilde{A}$ will correspond to $A$.

Footnote 6: Note that we can assume $g_\alpha \ne -\infty$ and simply avoid $A_\alpha = 0$.

For the $3 \times 3$ minors of $\tilde{A}$ to comply with $A_{ijk}$, note that, as was obtained in Theorem 9, we need to have
$2\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik} = c_{ijk}$ for all $\{i, j, k\} \subset \{1, \ldots, n\}$. Note that equations (160)-(162) give that
$|c_{ijk}| = 2|\tilde{a}_{ij}\tilde{a}_{jk}\tilde{a}_{ik}|$ for all $\{i, j, k\} \subset \{1, \ldots, n\}$. Again, similar to Theorem 9, we may assume that all the
off-diagonal terms in the first row are positive, since we can always use the transformation $D\tilde{A}D^{-1}$, where $D$ is a diagonal
matrix with $\pm 1$ elements, to make the $\tilde{a}_{1j}$ positive. Therefore, assuming the $\tilde{a}_{1j}$ are positive, we can choose the sign of
all off-diagonal terms $\tilde{a}_{jk}$, $j, k \in \{2, \ldots, n\}$, such that they have the same sign as $c_{1jk}$. This way we guarantee
$2\tilde{a}_{1j}\tilde{a}_{1k}\tilde{a}_{jk} = c_{1jk}$. However, note that the signs of all entries of $\tilde{A}$ are now fixed and therefore we need to
make sure the remaining conditions $c_{ijk} = 2\tilde{a}_{ij}\tilde{a}_{ik}\tilde{a}_{jk}$ for all $\{i, j, k\} \subset \{2, \ldots, n\}$ are also satisfied.
However, since $c_{1jk} = 2\tilde{a}_{1j}\tilde{a}_{1k}\tilde{a}_{jk}$ and $A$ satisfies the constraint (161) as well, $c_{ijk} = 2\tilde{a}_{ij}\tilde{a}_{ik}\tilde{a}_{jk}$ for
$\{i, j, k\} \subset \{2, \ldots, n\}$ follows immediately. Therefore equations (160) and (161) guarantee the equality of the $3 \times 3$
minors of $\tilde{A}$ with the corresponding entries of $A$. Note that property (159) assures that none of the $\tilde{a}_{ij}$ are zero and
hence all the above steps for determining the signs are valid.

Now we need to prove the equality of all minors of $\tilde{A}$ of size $4 \times 4$ and bigger with the respective entries of $A$.
This is enforced by condition (162). To see the reason, replace each term $A_\gamma$, $\gamma \subset \{1, \ldots, n\}$, in (162) by $\det \tilde{A}[\gamma]$.
Then, as we saw in part (a), the resulting equation describes $\det \tilde{A}[\{i,j,k,l\} \cup \alpha] = \det \tilde{A}[\beta]$ in terms of lower order
minors, which are already guaranteed to be equal to the corresponding entries of $A$. Therefore condition (162) is
enforcing $\det \tilde{A}[\beta] = A_\beta$. Note that for each $\beta$ only one equation of type (162) is required. Moreover, due to property
(159), the denominators in (162) obtained from (163) will be nonzero and will not cause any problems.

Finally note that there are $\binom{n-1}{2}$ constraints of type (160), $\binom{n-1}{3}$ of type (161) and $\sum_{m=4}^{n} \binom{n}{m}$ of
type (162), which sum up to $2^n - 1 - \frac{n(n+1)}{2}$. This is the number that we expect, noting that there are only $\frac{n(n+1)}{2}$
free parameters in a symmetric matrix while the vector of principal minors is of size $2^n - 1$. ■
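The counting argument at the end of the proof can be verified directly for small $n$. This is a short sketch; the function name `num_constraints` is a hypothetical label for the three binomial counts stated above.

```python
from math import comb

def num_constraints(n):
    """Number of equations of types (160), (161) and (162) for a given n >= 4."""
    type_160 = comb(n - 1, 2)                               # pairs {j, k} in {2..n}
    type_161 = comb(n - 1, 3)                               # triples {i, j, k} in {2..n}
    type_162 = sum(comb(n, m) for m in range(4, n + 1))     # one per subset of size >= 4
    return type_160 + type_161 + type_162

counts = {n: num_constraints(n) for n in range(4, 11)}
for n, total in counts.items():
    # Compare with 2^n - 1 minus the n(n+1)/2 free parameters of a symmetric matrix.
    print(n, total, 2 ** n - 1 - n * (n + 1) // 2)
```

For $n = 4$ the count reduces to the 5 conditions of Theorem 9: three of type (160), one of type (161) and one of type (162).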

Corollary 2: In Theorem 10, if we insist that $A_\alpha > 0$ for all $\alpha \subset \{1, \ldots, n\}$ and substitute each $A_\alpha$ by $e^{g_\alpha}$
in (160)-(162), then (160)-(162) give the necessary and sufficient conditions for a $2^n - 1$ dimensional vector to
correspond to the entropies of $n$ scalar jointly Gaussian random variables.
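To illustrate the substitution $A_\alpha = e^{g_\alpha}$, the sketch below maps a covariance matrix to the values $g_\alpha = \log A_\alpha$ and checks that property (159) becomes the inequality $g_\alpha + g_{i \cup j \cup \alpha} \le g_{i \cup \alpha} + g_{j \cup \alpha}$ in the logarithmic domain. It assumes that $g_\alpha$ denotes the logarithm of the principal minor, as the corollary's substitution suggests; the matrix `cov` and the helper names are hypothetical.

```python
import math
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def log_minor(cov, idx):
    """g_alpha = log A_alpha, with g of the empty set equal to 0 (A_empty = 1)."""
    if not idx:
        return 0.0
    return math.log(det([[cov[p][q] for q in idx] for p in idx]))

# Hypothetical covariance matrix: symmetric and strictly diagonally dominant with a
# positive diagonal, hence positive definite, so every principal minor is positive.
cov = [[5.0, 1.0, -2.0, 1.0],
       [1.0, 6.0, 1.0, -2.0],
       [-2.0, 1.0, 7.0, 1.0],
       [1.0, -2.0, 1.0, 8.0]]
n = len(cov)

gaps = []
for i, j in combinations(range(n), 2):
    rest = [t for t in range(n) if t not in (i, j)]
    for size in range(len(rest) + 1):
        for alpha in combinations(rest, size):
            g = lambda extra: log_minor(cov, tuple(sorted(alpha + extra)))
            # Log-domain form of (159): this gap should be non-negative.
            gaps.append(g((i,)) + g((j,)) - g(()) - g((i, j)))
print(len(gaps), min(gaps))
```

A strictly positive gap corresponds to the strict inequality in (159) for that choice of $i$, $j$ and $\alpha$.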

Remark: Note that in order to characterize the entropy region of scalar Gaussian random variables, what one really
needs is the convex hull of all such entropy vectors. After all, if we only wanted to determine whether 7 numbers
correspond to the entropy vector of 3 scalar-valued jointly Gaussian random variables, we could simply check
whether they satisfy the hyperdeterminant relation (139). However, it is the convex hull which is more interesting,
and more cumbersome to calculate, and this is what we addressed for 3 random variables in Section III.

VI. Discussions and Conclusions 

In this paper, we studied the entropy region of jointly Gaussian random variables as an interesting subclass of
continuous random variables. In particular we determined that the whole entropy region of 3 arbitrary continuous
random variables can be obtained from the convex cone of the entropy region of 3 scalar-valued jointly Gaussian
random variables. We also gave the representation of the entropy region of 3 vector-valued jointly Gaussian random
variables through a conjecture.

We should remark that, in general, to characterize the entropy region of Gaussian random variables one should
consider vector-valued random variables, which is probably more complex than the case of scalars. In Section III
we showed that, for $n = 3$, the vector-valued random variables do not result in a bigger region than the convex hull
of scalar ones. However, in general it is not known whether the entropy region of $n$ vector-valued jointly Gaussian
random variables is larger than the convex hull of the entropy region of scalar-valued Gaussians.

For $n \ge 4$ we explicitly stated the set of $2^n - 1 - \frac{n(n+1)}{2}$ constraints that a $2^n - 1$ dimensional vector should
satisfy in order to correspond to the entropy vector (equivalently the vector of all principal minors) of $n$ scalar
jointly Gaussian random variables. Although these conditions allow one to check whether a real vector of $2^n - 1$
numbers corresponds to an entropy vector of $n$ scalar jointly Gaussian random variables, they do not reveal whether such
a given vector of $2^n - 1$ real numbers corresponds to the entropy vector of vector-valued Gaussian random variables
or whether it lies in the convex hull of scalar Gaussian entropy vectors. Answering these questions requires one to study the
region of vector-valued jointly Gaussian random variables, and this is what we addressed in Section III for 3 random
variables. Obtaining the entropy region of vector-valued Gaussians seems to be rather complicated for $n \ge 4$, and
as a starting point one may instead focus on the convex hull of scalar Gaussians, which is essentially the convex
hull of vectors satisfying constraints (160)-(162). Studying such a convex hull has an interesting connection to the
concept of an "amoeba" in algebraic geometry. The "amoeba" of a polynomial $f(x_1, \ldots, x_k) = \sum_i c_i\, x_1^{p_{1i}} \cdots x_k^{p_{ki}}$
is defined as the image of $f = 0$ in $\mathbb{R}^k$ under the mapping $(x_1, \ldots, x_k) \mapsto (\log|x_1|, \ldots, \log|x_k|)$ [33]. It turns out
that many properties of amoebas can be deduced from the Newton polytope of $f$, which is defined as the convex
hull of the exponent vectors $(p_{1i}, \ldots, p_{ki})$ in $\mathbb{R}^k$ (see, e.g., [38]). In terms of our problem of interest, the scalar
Gaussian entropy points are the intersection of the amoebas associated to polynomials (160)-(162), and one should
look for the convex hull of the locus of these intersection points. If we allow the notion of amoeba to be defined
as the log mapping for any function (not just polynomials), then one could also formulate our problem of interest
as the convex hull of the amoeba of the algebraic variety obtained from the intersection of (160)-(162).

Finally, in characterizing the entropy region of Gaussian random variables for $n \ge 4$, we noted the important role
of the hyperdeterminant and with this viewpoint we also examined the hyperdeterminant relations. In particular, by
giving a determinant formula for a multilinear form, we gave a transparent proof that the hyperdeterminant relation
is satisfied by the principal minors of an $n \times n$ symmetric matrix. Moreover, we also obtained a determinant form
for the $2 \times 2 \times 2$ hyperdeterminant which might be extendable to higher order formats and is an interesting problem
in its own right.

Appendix

Proof of Lemma 4

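Proof: We will first show that $f(\delta) \le x_i x_j$ for all $i, j$. Let,

$$e(\delta) = -2 + \sum_{l=1}^{3} x_l^{\delta} + 2\sqrt{\prod_{l=1}^{3} \left(1 - x_l^{\delta}\right)} \tag{165}$$

For distinct $i, j, k \in \{1, 2, 3\}$, this can also be written as,

$$e(\delta) = (x_i x_j)^{\delta} - \left( (1 - x_i^{\delta})(1 - x_j^{\delta}) + (1 - x_k^{\delta}) - 2\sqrt{(1 - x_i^{\delta})(1 - x_j^{\delta})(1 - x_k^{\delta})} \right) \tag{166}$$

$$= (x_i x_j)^{\delta} - \left( \sqrt{(1 - x_i^{\delta})(1 - x_j^{\delta})} - \sqrt{1 - x_k^{\delta}} \right)^2 \tag{167}$$

which shows $e(\delta) \le (x_i x_j)^{\delta}$ and therefore, for all $\delta > 0$, $f(\delta) \le x_i x_j$, with equality if and only if
$(1 - x_i^{\delta})(1 - x_j^{\delta}) = 1 - x_k^{\delta}$ or equivalently,

$$(x_i x_j)^{\delta} + x_k^{\delta} - x_i^{\delta} - x_j^{\delta} = 0 \tag{168}$$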
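The rewriting of $e(\delta)$ as a product term minus a perfect square, and the resulting bound, can be checked numerically. The sketch below assumes $e(\delta)$ has the form $-2 + \sum_l x_l^\delta + 2\sqrt{\prod_l (1 - x_l^\delta)}$ used in this appendix; the sample point `x` and the function names are hypothetical.

```python
import math

def e_sum_form(x, d):
    """e(delta) written as -2 + sum of x_l^delta + 2 sqrt(prod(1 - x_l^delta))."""
    return -2 + sum(v ** d for v in x) + 2 * math.sqrt(math.prod(1 - v ** d for v in x))

def e_square_form(x, d, i, j, k):
    """e(delta) written as (x_i x_j)^delta minus a perfect square."""
    root_ij = math.sqrt((1 - x[i] ** d) * (1 - x[j] ** d))
    root_k = math.sqrt(1 - x[k] ** d)
    return (x[i] * x[j]) ** d - (root_ij - root_k) ** 2

x = (0.3, 0.5, 0.8)                      # hypothetical sample with 0 < x_l < 1
deltas = [0.25, 0.5, 1.0, 2.0, 5.0]
splits = [(0, 1, 2), (0, 2, 1), (1, 2, 0)]  # each choice of the pair (i, j) and the rest k

max_gap = max(abs(e_sum_form(x, d) - e_square_form(x, d, i, j, k))
              for d in deltas for i, j, k in splits)
bound_ok = all(e_sum_form(x, d) <= (x[i] * x[j]) ** d + 1e-12
               for d in deltas for i, j, k in splits)
print(max_gap, bound_ok)
```

Because the subtracted term is a square, the bound $e(\delta) \le (x_i x_j)^\delta$ holds for every pair, and equality requires the two square roots to coincide.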

Note that equality in (168) is only possible when $x_i x_j = \min_{p \ne q} x_p x_q$. Without loss of generality assume $x_1 \le x_2 \le x_3$, and
define,

$$y(\delta) = (x_1 x_2)^{\delta} + x_3^{\delta} - x_1^{\delta} - x_2^{\delta} \tag{169}$$

Clearly the zeros of $y(\delta)$ determine the global maxima of $f(\delta)$ (i.e., the points where $f(\delta) = x_1 x_2$). Therefore we analyze the
behavior of $y(\delta)$ in the following scenarios (based on the assumption $x_1 \le x_2 \le x_3$):
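1) $x_1 < x_2 < x_3 < 1$:

Note that for any number $0 < x < 1$, when $\delta \to 0$ we have the approximation $x^{\delta} \approx 1 + \delta \log x$. Therefore
as $\delta \to 0^+$, we obtain,

$$y(\delta) \approx \delta \log x_3 < 0 \tag{170}$$

On the other hand, when $\delta \to \infty$ we have $y(\delta) \approx x_3^{\delta} > 0$. Therefore $y(\delta)$ has at least one zero (which is not
at the origin). In fact we now show that it has exactly one zero. Let
$a(\delta) = \frac{y(\delta)}{x_3^{\delta}} = 1 + \left(\frac{x_1 x_2}{x_3}\right)^{\delta} - \left(\frac{x_1}{x_3}\right)^{\delta} - \left(\frac{x_2}{x_3}\right)^{\delta}$.
Then the zeros of $y(\delta)$ and $a(\delta)$ match except possibly at infinity. Therefore we can equivalently determine the
zeros of $a(\delta)$. To do so, we further define
$b(\delta) = \left(\frac{x_3}{x_2}\right)^{\delta} \cdot \frac{da(\delta)}{d\delta} = x_1^{\delta} \log\left(\frac{x_1 x_2}{x_3}\right) - \left(\frac{x_1}{x_2}\right)^{\delta} \log\left(\frac{x_1}{x_3}\right) - \log\left(\frac{x_2}{x_3}\right)$.
Note that as $\delta \to 0^+$, $b(0^+) \approx \log x_3 < 0$. Moreover as $\delta \to \infty$ we obtain $b(\infty) \approx \log\left(\frac{x_3}{x_2}\right) > 0$. On the
other hand, obtaining the derivative of $b(\delta)$ gives,

$$\frac{db(\delta)}{d\delta} = x_1^{\delta} \log(x_1) \log\left(\frac{x_1 x_2}{x_3}\right) - \left(\frac{x_1}{x_2}\right)^{\delta} \log\left(\frac{x_1}{x_2}\right) \log\left(\frac{x_1}{x_3}\right) \tag{171}$$

which has a unique zero $\delta^*$ given by

$$x_2^{\delta^*} = \frac{\log\left(\frac{x_1}{x_2}\right) \log\left(\frac{x_1}{x_3}\right)}{\log(x_1) \log\left(\frac{x_1 x_2}{x_3}\right)} \tag{172}$$

Calculating $\frac{d^2 b(\delta)}{d\delta^2}$ shows that $b(\delta)$ has a maximum at $\delta^*$. Noting that the derivative of $b(\delta)$ is defined for all $\delta$
yet is zero only at $\delta^*$, and that $b(0^+) < 0$ and $b(\infty) > 0$, we conclude that $b(\delta)$ has exactly one zero at some
$0 < \hat{\delta} < \delta^*$. Since we defined $b(\delta) = \left(\frac{x_3}{x_2}\right)^{\delta} \cdot \frac{da(\delta)}{d\delta}$, we obtain that $\frac{da(\delta)}{d\delta}$ also has exactly one zero at $\hat{\delta}$ (and
is also possibly zero at infinity). Moreover, as $\delta \to 0^+$ we have $a(0^+) \approx \delta \log x_3 < 0$, $\frac{da}{d\delta}(0^+) \approx \log x_3 < 0$
and $a(\infty) \approx 1$. Since $a(\delta)$ is also everywhere differentiable, we deduce that $a(\delta)$ has exactly one zero (that is
not at the origin) at some $\delta_0$ where $\delta_0 > \hat{\delta} > 0$. Finally, going back to $y(\delta)$, we obtain that $y(\delta)$ starts at the origin,
i.e., $y(0) = 0$, however with a negative slope $\frac{dy}{d\delta}(0^+) = \log x_3 < 0$. It has a unique zero at $\delta_0 > 0$ and it
approaches the $\delta$-axis again at infinity with a positive sign, i.e., $y(\delta) \approx x_3^{\delta} \to 0^+$ as $\delta \to \infty$. Therefore we
have shown that the behavior of $y(\delta)$ is as depicted in Fig. 2(a).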

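Note that since the zeros of $y(\delta)$ determine the global maxima of $f(\delta)$, to determine where the global
maxima of $f(\delta)$ occur we need to calculate $f(\delta)$ at $0$, $\infty$ and $\delta_0$. First, for $\delta \to 0^+$ we have,

$$f(\delta) \approx \left(1 + \delta \log(x_1 x_2 x_3)\right)^{1/\delta} \to x_1 x_2 x_3 \tag{173}$$

Moreover, for $\delta \to \infty$ we have $(1 - x^{\delta})^{1/2} \approx 1 - \frac{1}{2} x^{\delta} - \frac{1}{8} x^{2\delta}$ for $0 < x < 1$. Therefore we can write,

$$e(\delta) \approx -2 + \sum_{l=1}^{3} x_l^{\delta} + 2\left(1 - \tfrac{1}{2} x_1^{\delta} - \tfrac{1}{8} x_1^{2\delta}\right)\left(1 - \tfrac{1}{2} x_2^{\delta} - \tfrac{1}{8} x_2^{2\delta}\right)\left(1 - \tfrac{1}{2} x_3^{\delta} - \tfrac{1}{8} x_3^{2\delta}\right) \tag{174}$$

$$\approx \tfrac{1}{2}(x_1 x_2)^{\delta} + \tfrac{1}{2}(x_1 x_3)^{\delta} + \tfrac{1}{2}(x_2 x_3)^{\delta} - \tfrac{1}{4} x_1^{2\delta} - \tfrac{1}{4} x_2^{2\delta} - \tfrac{1}{4} x_3^{2\delta} \tag{175}$$

Since $\delta \to \infty$ and $x_1 < x_2 < x_3$, the term $x_3^{2\delta}$ will be the dominant term and we will have,

$$e(\delta) \approx -\tfrac{1}{4} x_3^{2\delta} < 0 \tag{176}$$

Therefore as $\delta \to \infty$ we obtain $f(\delta) = \left(\max(0, e(\delta))\right)^{1/\delta} = 0$. As a result $f(\delta)$ has a unique global
maximizer at $\delta_0$ where $f(\delta_0) = x_1 x_2$. Note in Fig. 2(a) that the zero of $y(\delta)$ coincides with the maximum
of $f(\delta)$.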
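The behavior established for this first case can be reproduced numerically. The following sketch is an illustration only: it assumes $f(\delta) = (\max(0, e(\delta)))^{1/\delta}$ with $e(\delta)$ as in (165), picks a hypothetical sample point `x` with $x_1 < x_2 < x_3 < 1$, and locates the nonzero root $\delta_0$ of $y(\delta)$ by plain bisection (a numerical technique chosen here, not one used in the proof).

```python
import math

def e(x, d):
    """e(delta) in the sum form assumed from (165)."""
    return -2 + sum(v ** d for v in x) + 2 * math.sqrt(math.prod(1 - v ** d for v in x))

def f(x, d):
    """f(delta) = max(0, e(delta)) ** (1 / delta)."""
    return max(0.0, e(x, d)) ** (1.0 / d)

def y(x, d):
    """y(delta) from (169), with the entries of x sorted increasingly."""
    x1, x2, x3 = sorted(x)
    return (x1 * x2) ** d + x3 ** d - x1 ** d - x2 ** d

def bisect_zero(func, lo, hi, iters=200):
    """Bisection for a root of func on [lo, hi], assuming func(lo) < 0 < func(hi)."""
    assert func(lo) < 0 < func(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = (0.3, 0.5, 0.8)   # hypothetical sample for the case x1 < x2 < x3 < 1
target = x[0] * x[1]  # the claimed maximum value x1 * x2

delta0 = bisect_zero(lambda d: y(x, d), 1e-3, 10.0)
grid = [0.05 * t for t in range(1, 401)]
f_max_on_grid = max(f(x, d) for d in grid)
print(delta0, f(x, delta0), f_max_on_grid, target)
```

On this sample, $y(1) > 0$, so the root $\delta_0$ falls below 1; the grid search is only a coarse sanity check that no grid value of $f$ exceeds $x_1 x_2$.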

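2) $x_1 = x_2 < x_3 < 1$:

In this case we have $y(\delta) = x_1^{2\delta} + x_3^{\delta} - 2 x_1^{\delta}$. Similar to the previous case we define
$a(\delta) = \frac{y(\delta)}{x_3^{\delta}} = \left(\frac{x_1^2}{x_3}\right)^{\delta} + 1 - 2\left(\frac{x_1}{x_3}\right)^{\delta}$ and analyze the zeros of $a(\delta)$. Note that
$\frac{da(\delta)}{d\delta} = \left(\frac{x_1^2}{x_3}\right)^{\delta} \log\left(\frac{x_1^2}{x_3}\right) - 2\left(\frac{x_1}{x_3}\right)^{\delta} \log\left(\frac{x_1}{x_3}\right)$,
which has a unique zero at the point $\hat{\delta}$ given by,

$$x_1^{\hat{\delta}} = \frac{2 \log\left(\frac{x_1}{x_3}\right)}{\log\left(\frac{x_1^2}{x_3}\right)} \tag{177}$$

Since $a(\delta) \approx \delta \log x_3 < 0$ as $\delta \to 0^+$ and $a(\infty) \approx 1$, and again $a(\delta)$ is everywhere differentiable, we
obtain that $a(\delta)$ has exactly one zero (that is not at the origin) at some $\delta_0$ where $\delta_0 > \hat{\delta} > 0$. Therefore $y(\delta)$
also has exactly one zero (that is not at the origin), at $\delta_0$. Again we have $y(\delta) \approx \delta \log x_3 < 0$ as $\delta \to 0^+$ and
$y(\infty) \approx x_3^{\delta} > 0$, and hence its behavior is again as depicted in Fig. 2(b).

The analysis of $e(\delta)$ and $f(\delta)$ is similar to the last case. In particular we have $f(0^+) \approx x_1^2 x_3$ and
$e(\infty) \approx -\frac{1}{4} x_3^{2\delta} < 0$, giving $f(\delta) = 0$ for $\delta \to \infty$. Hence $f(\delta)$ has a unique maximizer at $\delta_0$ such that $f(\delta_0) = x_1^2$.
Again it is evident from Fig. 2(b) that the zero of $y(\delta)$ coincides with the maximum of $f(\delta)$.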

Fig. 2. Functions $f(\delta)$ and $y(\delta)$ versus $\delta$. The solid line shows the function $f(\delta)$ and the dashed line shows the function $y(\delta)$.

3) $x_1 < x_2 = x_3 < 1$:

In this case $y(\delta) = (x_1 x_3)^{\delta} - x_1^{\delta} < 0$. Calculating the derivative gives $\frac{dy(\delta)}{d\delta} = (x_1 x_3)^{\delta} \log(x_1 x_3) - x_1^{\delta} \log x_1$, which has a unique
zero at $\hat{\delta}$ given by,

$$x_3^{\hat{\delta}} = \frac{\log x_1}{\log(x_1 x_3)} \tag{178}$$

Moreover, as $\delta \to 0^+$ we have $y(\delta) \approx \delta \log x_3 \to 0^-$ and as $\delta \to \infty$, $y(\delta) \approx -x_1^{\delta} \to 0^-$. Therefore the
behavior of $y(\delta)$ in this case is as shown in Fig. 2(c).

The analysis of $e(\delta)$ and $f(\delta)$ is again similar to the previous 2 cases. However, note that in this case the
only zero of $y(\delta)$ is at the origin and as a result we only need to consider the value of $f(\delta)$ at $0$ and infinity.
Following a similar procedure as in the previous cases, we obtain $f(0^+) \approx x_1 x_3^2$, and for $\delta \to \infty$, by replacing
$x_2 = x_3$ in (175), we get,

$$e(\delta) \approx (x_1 x_3)^{\delta} - \tfrac{1}{4} x_1^{2\delta} \approx (x_1 x_3)^{\delta} \tag{179}$$

Therefore $f(\infty) \approx x_1 x_3$, i.e., $f(\delta)$ approaches its global maximum at infinity.

4) $x_1 = x_2 = x_3 < 1$:

In this case, $y(\delta) = x_3^{2\delta} - x_3^{\delta} < 0$ and we have $\frac{dy(\delta)}{d\delta} = 2 x_3^{2\delta} \log x_3 - x_3^{\delta} \log x_3$, which again has a unique
zero at $\hat{\delta}$ given by $x_3^{\hat{\delta}} = \frac{1}{2}$. As in the previous case, $y(0^+) \approx \delta \log x_3 \to 0^-$ and $y(\infty) \approx -x_3^{\delta} \to 0^-$, and
$y(\delta)$ behaves as in the previous case (Fig. 2(d)).

Since the zeros of $y(\delta)$ occur at $0$ and infinity, we only need to calculate the value of $f(\delta)$ at $0$ and infinity. For
$\delta \to 0^+$ we have $f(0^+) \approx x_3^3$. On the other hand, by replacing $x_1 = x_2 = x_3$ in (175) we obtain, as $\delta \to \infty$,

$$e(\delta) \approx \tfrac{3}{4} x_3^{2\delta} \tag{180}$$

This yields $f(\delta) \approx \left(\tfrac{3}{4}\right)^{1/\delta} x_3^2 \approx x_3^2$ as $\delta \to \infty$. As a result $f(\delta)$ again approaches its global maximum at
infinity.

5) $x_1 < x_2 < x_3 = 1$:

In this case $y(\delta)$ simplifies to $y(\delta) = (x_1 x_2)^{\delta} + 1 - x_1^{\delta} - x_2^{\delta} = (1 - x_1^{\delta})(1 - x_2^{\delta}) > 0$. Note that $y(0) = 0$,
$y(\infty) \to 1$ and $y(\delta)$ is an increasing function. The behavior of $y(\delta)$ is shown in Fig. 2(e).
Since the zero of $y(\delta)$ occurs at zero, we only need to evaluate $f(\delta)$ at zero, for which we obtain
$e(0^+) \approx 1 + \delta \log(x_1 x_2)$ and therefore $f(0^+) \approx x_1 x_2$, i.e., the global maximum of $f(\delta)$ is at zero.

6) $x_1 = x_2 < x_3 = 1$:

In this case we have $y(\delta) = x_1^{2\delta} + 1 - 2 x_1^{\delta} = (1 - x_1^{\delta})^2 > 0$. The behavior of $y(\delta)$ is shown in Fig. 2(f).

For this scenario again we only need to consider $f(0)$, which can easily be obtained to be $f(0^+) \approx x_1^2$, which
is the global maximum.

7) $x_1 \le x_2 = x_3 = 1$:

Here we obtain $y(\delta) = 0$, a constant function. To evaluate $f(\delta)$ we do not need to use $y(\delta)$ in this case. In
fact we have $f(\delta) = x_1$, which is a constant function as well and equal to its global maximum everywhere.
Thus far we have shown that $f(\delta)$ (except for the case $x_1 \le x_2 = x_3 = 1$, in which $f(\delta)$ is a constant) has a
unique global maximizer at which $f(\delta) = \min_{i,j} x_i x_j$. Moreover, in all these cases, if for some $\delta > 0$ we have
$y(\delta) < 0$, it can be seen that the maximum of $f(\delta)$ occurs for some $\delta_0 > \delta$. Noting that $y$ as defined in the statement
of the theorem is in fact $y(1)$, it immediately follows that if $y < 0$ then $f(\delta)$ attains its global maximum for some
$\delta > 1$. ■
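The closing claim, that $y(1) < 0$ forces the maximizer of $f(\delta)$ beyond $\delta = 1$, can be illustrated on a sample point. This is a sketch under the same assumptions as the rest of the appendix ($f(\delta) = (\max(0, e(\delta)))^{1/\delta}$ with $e(\delta)$ as in (165)); the point `x`, the search interval and the bisection routine are hypothetical choices made for the illustration.

```python
import math

def e(x, d):
    """e(delta) in the sum form assumed from (165)."""
    return -2 + sum(v ** d for v in x) + 2 * math.sqrt(math.prod(1 - v ** d for v in x))

def f(x, d):
    """f(delta) = max(0, e(delta)) ** (1 / delta)."""
    return max(0.0, e(x, d)) ** (1.0 / d)

def y(x, d):
    """y(delta) from (169), with the entries of x sorted increasingly."""
    x1, x2, x3 = sorted(x)
    return (x1 * x2) ** d + x3 ** d - x1 ** d - x2 ** d

def bisect_zero(func, lo, hi, iters=200):
    """Bisection for a root of func on [lo, hi], assuming func(lo) < 0 < func(hi)."""
    assert func(lo) < 0 < func(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = (0.9, 0.95, 0.97)  # hypothetical sample chosen so that y(1) < 0
target = x[0] * x[1]   # the claimed maximum value x1 * x2

y_at_one = y(x, 1.0)
delta0 = bisect_zero(lambda d: y(x, d), 1.0, 200.0)
print(y_at_one, delta0, f(x, 1.0), f(x, delta0), target)
```

Here the root of $y(\delta)$ lies to the right of $\delta = 1$, and $f$ evaluated at that root reaches $x_1 x_2$ while $f(1)$ stays strictly below it.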